Listing the contents of a Google Drive folder programmatically is useful for many tasks, especially in collaborative and remote working environments. For example, you might need the links to hundreds of PDFs to incorporate them into a dashboard, or access to multiple Excel sheets for consolidating financial reports. Browsing large folders by hand is slow and error-prone, but the Google Drive API with Python lets you extract and manage this information efficiently.
The steps below explain how to return a list of the files (names and links) in a Google Drive folder.
- Create a project in the Google Cloud Console.
- Enable the Google Drive API for your project (select "Google Drive API" in the API Library).
- Configure the OAuth consent screen.
- After you click Create, a form opens. Make sure to use the email address associated with your Google Drive account.
- Add yourself as a test user.
- Create credentials (OAuth 2.0 Client IDs) and download the credentials.json file.
- Place your credentials.json file in your working directory or specify its path in your code.
- Once you have completed these steps, use the code below to list the files in your Google Drive folder.
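The steps above produce an OAuth client file (credentials.json), whereas the script in this post loads a service account key, which is a different kind of JSON file created under Service Accounts in the Cloud Console. As a quick sanity check before running anything, the sketch below inspects a downloaded file and reports which kind it is. The function names `classify_key_data` and `classify_key_file` are illustrative and not part of any Google library; the check assumes the usual layout of these files (a top-level `installed` or `web` key for OAuth clients, `"type": "service_account"` for service account keys).

```python
import json
from pathlib import Path


def classify_key_data(data):
    """Classify parsed JSON from a Google credentials file.

    Returns 'service_account', 'oauth_client', or 'unknown'.
    (Hypothetical helper, based on the usual layout of these files.)
    """
    if data.get("type") == "service_account":
        return "service_account"
    # OAuth client files wrap their settings in an "installed" (desktop app)
    # or "web" (web app) object.
    if "installed" in data or "web" in data:
        return "oauth_client"
    return "unknown"


def classify_key_file(path):
    """Load a JSON credentials file from disk and classify it."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return classify_key_data(data)


# In-memory demo; real files contain more fields than shown here.
print(classify_key_data({"installed": {"client_id": "example"}}))
print(classify_key_data({"type": "service_account"}))
```

Calling `classify_key_file('credentials.json')` on your own download tells you whether it matches what the script expects. If it reports `oauth_client`, either create a service account key for the script or switch the script to an OAuth flow.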
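from googleapiclient.discovery import build
from oauth2client.service_account import ServiceAccountCredentials


def get_credentials():
    """Retrieve Google API credentials from a service account key file."""
    # Path to your service account key file. Share the target folder with
    # the service account's email address so it can see the files.
    key_file_location = 'INSERT_YOUR_SERVICE_ACCOUNT_KEY_FILE_PATH_HERE'
    scopes = ['https://www.googleapis.com/auth/drive']
    credentials = ServiceAccountCredentials.from_json_keyfile_name(key_file_location, scopes=scopes)
    return credentials
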

def main():
    """List all files in a specified Google Drive folder and save the output."""
    creds = get_credentials()
    service = build('drive', 'v3', credentials=creds)

    # Specify the folder ID (the last segment of the folder's URL)
    folder_id = 'INSERT_YOUR_FOLDER_ID_HERE'

    # Query to search for files directly inside the specified folder
    query = f"'{folder_id}' in parents"

    # Prepare to handle pagination
    page_token = None
    all_items = []

    # Define the path for the output file
    output_file_path = 'INSERT_YOUR_OUTPUT_FILE_PATH_HERE'

    # Continue to fetch data until there are no more pages
    while True:
        results = service.files().list(
            q=query,
            spaces='drive',
            fields='nextPageToken, files(id, name, webViewLink)',
            pageSize=100,  # Adjust pageSize if needed
            pageToken=page_token
        ).execute()
        all_items.extend(results.get('files', []))
        page_token = results.get('nextPageToken')
        if not page_token:
            break

    # Write the output to a file; UTF-8 handles non-ASCII file names
    with open(output_file_path, 'w', encoding='utf-8') as file:
        if not all_items:
            print('No files found.')
            file.write('No files found.\n')
        else:
            print('Files and URLs:')
            file.write('Files and URLs:\n')
            for item in all_items:
                output_line = f"{item['name']} ({item['webViewLink']})\n"
                print(output_line, end='')
                file.write(output_line)


if __name__ == '__main__':
    main()
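Note that the script above lists only the direct children of one folder and reports names and links, not nested paths. If you need folder-relative paths for files in subfolders, one possible approach is to walk the tree recursively. The sketch below assumes a hypothetical `list_children(folder_id)` callable that returns the items in a folder (in practice it would wrap the paginated `files().list` loop above, with `mimeType` added to `fields`). Here it is replaced by an in-memory stand-in so the example runs without network access; `walk_paths`, `FAKE_DRIVE`, and `fake_list_children` are illustrative names.

```python
# MIME type the Drive API uses to mark folders
FOLDER_MIME = "application/vnd.google-apps.folder"


def walk_paths(folder_id, list_children, prefix=""):
    """Yield (path, item) for every non-folder item under folder_id.

    list_children is any callable mapping a folder ID to a list of dicts
    with 'id', 'name', and 'mimeType' keys (an assumption of this sketch).
    """
    for item in list_children(folder_id):
        path = f"{prefix}/{item['name']}" if prefix else item["name"]
        if item.get("mimeType") == FOLDER_MIME:
            # Descend into subfolders, carrying the path built so far
            yield from walk_paths(item["id"], list_children, path)
        else:
            yield path, item


# Stand-in data instead of real API calls
FAKE_DRIVE = {
    "root": [
        {"id": "f1", "name": "Reports", "mimeType": FOLDER_MIME},
        {"id": "a", "name": "summary.pdf", "mimeType": "application/pdf"},
    ],
    "f1": [
        {"id": "b", "name": "q1.xlsx", "mimeType": "application/octet-stream"},
    ],
}


def fake_list_children(folder_id):
    """Return the fake children of a folder, or an empty list."""
    return FAKE_DRIVE.get(folder_id, [])


paths = [path for path, _ in walk_paths("root", fake_list_children)]
print(paths)  # ['Reports/q1.xlsx', 'summary.pdf']
```

Passing the listing function in as a parameter keeps the traversal logic separate from the API calls, which also makes it easy to test. Each subfolder costs at least one extra request, so deep trees will be slower to list.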