Using Google API To Get A Google Drive Directory

by Salome Grasland

Accessing a directory on Google Drive is crucial for a range of tasks, especially in collaborative and remote working environments. For example, you might need to retrieve file paths for hundreds of PDFs to incorporate them into a dashboard, or you may require access to multiple Excel sheets for consolidating financial reports. While navigating directories on Google Drive can be complex, leveraging the Google Drive API with Python allows you to efficiently extract and manage the necessary information.

The steps below show how to return a list of the files in a Google Drive folder, with each file's name and link.

In the Google Cloud Console, select the Google Drive API and enable it

  • Configure the OAuth consent screen.
  • Once you click Create, it will take you to a form. Make sure to use the email address associated with your Google Drive account.
  • Make sure to add yourself as a test user.

  • Create credentials (OAuth 2.0 Client IDs) and download the credentials.json file.
  • Place your credentials.json file in your working directory or specify its path in your code.
  • Once you have completed these steps, use the code below to return a directory of the files in your Google Drive folder:
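
The steps above download an OAuth client file, while the script below loads a service account key, and the two JSON layouts are not interchangeable. As a quick pre-check, the sketch below reads a downloaded file and reports which kind it is. The helper name credentials_kind is hypothetical. It relies only on the layout of the two file types: a service account key has a top-level "type": "service_account" field, and an OAuth client file nests its settings under "installed" or "web".

```python
import json
from pathlib import Path


def credentials_kind(path):
    """Return 'service_account', 'oauth_client', or 'unknown' for a JSON credentials file.

    Hypothetical helper for illustration; it only inspects top-level keys.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        return "unknown"
    # Service account keys declare their type explicitly.
    if data.get("type") == "service_account":
        return "service_account"
    # OAuth client files wrap their settings in "installed" (desktop) or "web".
    if "installed" in data or "web" in data:
        return "oauth_client"
    return "unknown"
```

If credentials_kind('credentials.json') returns 'oauth_client', that file cannot be loaded by ServiceAccountCredentials.from_json_keyfile_name. In that case the script needs a service account key instead, and the Drive folder has to be shared with the service account's email address.
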
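from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials


def get_credentials():
    """Retrieve Google API credentials from a service account key file."""
    # Path to your service account key file. This must be a service account
    # JSON key, not an OAuth client ID file, and the Drive folder must be
    # shared with the service account's email address.
    key_file_location = 'INSERT_YOUR_SERVICE_ACCOUNT_KEY_FILE_PATH_HERE'
    # Full Drive scope; the read-only scope
    # 'https://www.googleapis.com/auth/drive.readonly' is enough for listing files.
    scopes = ['https://www.googleapis.com/auth/drive']
    credentials = ServiceAccountCredentials.from_json_keyfile_name(key_file_location, scopes=scopes)
    return credentials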

def main():
    """Main function to list all files in a specified Google Drive folder and save the output."""
    creds = get_credentials()
    service = build('drive', 'v3', credentials=creds)

    # Specify the folder ID
    folder_id = 'INSERT_YOUR_FOLDER_ID_HERE'
    
    # Query to search for files in the specified folder
    query = f"'{folder_id}' in parents"

    # Prepare to handle pagination
    nextPageToken = None
    all_items = []

    # Define the path for the output file
    output_file_path = 'INSERT_YOUR_OUTPUT_FILE_PATH_HERE'

    # Continue to fetch data until there are no more pages
    while True:
        results = service.files().list(
            q=query,
            spaces='drive',
            fields='nextPageToken, files(id, name, webViewLink)',
            pageSize=100,  # Results per page; adjust if needed (maximum 1000)
            pageToken=nextPageToken
        ).execute()

        items = results.get('files', [])
        all_items.extend(items)
        nextPageToken = results.get('nextPageToken', None)

        if not nextPageToken:
            break

    # Open a file to write the output (UTF-8 so non-ASCII file names are written safely)
    with open(output_file_path, 'w', encoding='utf-8') as file:
        if not all_items:
            print('No files found.')
            file.write('No files found.\n')
        else:
            print('Files and URLs:')
            file.write('Files and URLs:\n')
            for item in all_items:
                output_line = f"{item['name']} ({item['webViewLink']})\n"
                print(output_line, end='')
                file.write(output_line)

if __name__ == '__main__':
    main()
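
The script expects a folder ID and lists everything whose parent is that folder, including items in the trash. As a hedged sketch of preparing those two inputs, the helpers below pull the ID out of a folder link and build a narrower query string. Both function names are hypothetical, and the URL pattern (a link containing /folders/ followed by the ID) is an assumption about how your folder link looks. The terms trashed = false and mimeType = '...' are standard Drive search terms.

```python
import re


def folder_id_from_url(url):
    """Extract the folder ID from a Drive folder URL, or return None if absent."""
    # Assumes the ID is the path segment right after "/folders/".
    match = re.search(r"/folders/([A-Za-z0-9_-]+)", url)
    return match.group(1) if match else None


def build_folder_query(folder_id, mime_type=None, include_trashed=False):
    """Build a files.list query string for the direct children of a folder."""
    clauses = [f"'{folder_id}' in parents"]
    if not include_trashed:
        # Skip files that sit in the trash.
        clauses.append("trashed = false")
    if mime_type:
        # Restrict to one file type, e.g. 'application/pdf'.
        clauses.append(f"mimeType = '{mime_type}'")
    return " and ".join(clauses)
```

For example, build_folder_query(folder_id, mime_type='application/pdf') could replace the query string in main() to list only the PDFs in the folder that are not in the trash.
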