Uploading large files to SharePoint through the Python Office365 API results in a broken file

BrianC 40 Reputation points
2025-01-10T02:25:52.24+00:00

Hi all,

Recently I have been using the Python Office365 API to upload files to SharePoint. I have a single CSV file with a size of 1 GB.

Sometimes, a 503 Server Error occurs during the upload process. Other times, the upload seems successful without raising any errors, but when I try to read or download the file, it shows a file size of 0 kB. Therefore, I am wondering if there is something wrong with my code.

def upload_file_to_sharepoint(self, local_relative_path, subfolder, filename, chunk_size=500000):
	try:
		# Build the local path portably instead of concatenating with "\\"
		local_full_path = os.path.join(os.getcwd(), local_relative_path)
		relative_url = self.data_folder + '/' + subfolder
		folder = self.ctx.web.get_folder_by_server_relative_url(relative_url)
		with open(local_full_path, "rb") as file_to_upload:
			folder.files.create_upload_session(
				file=file_to_upload, chunk_size=chunk_size, file_name=filename
			).execute_query()
		self.logger.info(f"{filename} has been uploaded successfully!")
	except Exception as e:
		self.logger.error(e)

Any assistance or insights into resolving this problem would be greatly appreciated. Thanks in advance.

SharePoint

Accepted answer
    RaytheonXie_MSFT 38,036 Reputation points · Microsoft Vendor
    2025-01-10T06:34:33.9633333+00:00

    Hi @BrianC,

    You will need to implement a function that uploads the file in chunks, using the Office365-REST-Python-Client to form the basis for the connection. Please refer to the following code:

    import os
    import uuid
    from pathlib import Path

    from office365.runtime.auth.client_credential import ClientCredential
    from office365.sharepoint.client_context import ClientContext

    # CLIENT_ID and CLIENT_SECRET come from your app registration.
    URL = "https://myorg.sharepoint.com/sites/myapp"

    def sharepoint_upload_chunked(blob_path: Path, filename: str, sharepoint_folder: str, chunk_size: int):
        '''
            input:
            blob_path : path to the binary file to upload
            filename : the name to give the uploaded file
            sharepoint_folder : the name of the folder you want to upload to
            chunk_size : size of the chunks in bytes
        '''
        # log in
        ctx = ClientContext(URL).with_credentials(
                            ClientCredential(CLIENT_ID, CLIENT_SECRET))
        with open(blob_path, 'rb') as f:
            first_chunk = True
            offset = 0
            filesize = os.path.getsize(blob_path)
            # take the part of the URL after the host; you are already logged in to
            # myorg.sharepoint.com via ctx (the client context)
            sharepoint_folder_long = URL[29:] + f"/{sharepoint_folder}"
            file_url = sharepoint_folder_long + f"/{filename}"
            # each upload session needs a GUID; you reference it in every chunk request
            upload_id = uuid.uuid4()
            # consume the data in chunks
            while chunk := f.read(chunk_size):
                # see GitHub for the progress bar code; this is for large uploads, so it really helps
                progressbar(offset, filesize, 30, '■')
                # comparing the running offset against the file size detects the last
                # chunk even when the file size is an exact multiple of chunk_size
                is_last_chunk = offset + len(chunk) >= filesize
                # start upload
                if first_chunk:
                    # you need to initialize an empty file to upload into
                    print("adding empty file")
                    endpoint_url = f"{URL}/_api/web/getfolderbyserverrelativeurl('{sharepoint_folder_long}')/files/add(url='{filename}', overwrite=true)"
                    upload_data(ctx, endpoint_url, bytes())
                    endpoint_url = f"{URL}/_api/web/getfilebyserverrelativeurl('{file_url}')/startupload(uploadID=guid'{upload_id}')"
                    response = upload_data(ctx, endpoint_url, chunk)
                    first_chunk = False
                # finish upload with the final chunk
                elif is_last_chunk:
                    endpoint_url = f"{URL}/_api/web/getfilebyserverrelativeurl('{file_url}')/finishupload(uploadID=guid'{upload_id}',fileOffset={offset})"
                    progressbar(filesize, filesize, 30, '■')
                    response = upload_data(ctx, endpoint_url, chunk)
                    print(response)
                # continue upload
                else:
                    # continue to consume the chunks and upload
                    endpoint_url = f"{URL}/_api/web/getfilebyserverrelativeurl('{file_url}')/continueupload(uploadID=guid'{upload_id}',fileOffset={offset})"
                    response = upload_data(ctx, endpoint_url, chunk)
                # len(chunk) is in bytes, since the file is opened in binary mode
                offset += len(chunk)

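    One way to detect the final chunk reliably, including when the file size is an exact multiple of `chunk_size` (where "the last chunk is smaller than the previous one" fails), is to compare the running offset against the file size. The sketch below exercises that bookkeeping locally without SharePoint; the `iter_chunks` helper name is illustrative, not part of the library:

```python
import os
import tempfile

def iter_chunks(path, chunk_size):
    """Yield (chunk, offset, is_last) for a file, mirroring the upload loop."""
    filesize = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            # the last chunk is the one that brings the offset up to the file size
            yield chunk, offset, offset + len(chunk) >= filesize
            offset += len(chunk)

# Exercise the logic on a 10-byte file with a chunk size that divides it evenly,
# exactly the case a size-comparison heuristic would miss.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
path = tmp.name
flags = [(offset, last) for _, offset, last in iter_chunks(path, 5)]
os.unlink(path)
print(flags)  # [(0, False), (5, True)]
```

    The finish branch fires exactly once, on the chunk whose end reaches the file size, so `finishupload` is always called and the upload is never left half-committed at 0 kB.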

    You can get the full code, including the `upload_data` and `progressbar` helper functions, in the following document:

    https://github.com/SteveScott/office-365-python-rest-client-chunked-upload-example/blob/main/sharepoint_upload.py
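    Regarding the intermittent 503 errors mentioned in the question: SharePoint Online throttles heavy traffic, so retrying each request with exponential backoff usually helps. The sketch below is illustrative and not part of the linked example; `with_retries` and the `flaky_upload` stub are hypothetical names standing in for any chunk-upload call:

```python
import time

def with_retries(func, *args, max_attempts=4, base_delay=1.0, **kwargs):
    """Call func, retrying on exception with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts, propagate the error
            time.sleep(base_delay * 2 ** attempt)

# Demonstrate with a stub that fails twice before succeeding.
calls = {"n": 0}

def flaky_upload(data):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("503 Server Error")
    return f"uploaded {len(data)} bytes"

result = with_retries(flaky_upload, b"chunk", base_delay=0.01)
print(result, calls["n"])  # uploaded 5 bytes 3
```

    In production you would want to catch only transient HTTP errors (429/503) rather than every exception, and honor the `Retry-After` header when the server sends one.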


0 additional answers
