Hi Team,

Could you please help us find the failed notebook in a parallel run? We are using the code below but are not able to find which one failed. For example, if any notebook fails, the process stops, but we don't know which notebook failed. Can you please provide code to find which notebook failed and print it?

# run multiple notebooks with parameters
DAG = {
    "activities": [
        {"name": "Notebook1", "path": "notebook1", "timeoutPerCellInSeconds": 120},
        {"name": "notebbok2", "path": "notebbok2", "timeoutPerCellInSeconds": 120},
        {"name": "notebook3", "path": "notebook3", "timeoutPerCellInSeconds": 120},
        {"name": "notebbok4", "path": "notebbok4", "timeoutPerCellInSeconds": 120},
    ]
}

try:
    mssparkutils.notebook.runMultiple(DAG)
except:
    print(DAG.name)

Using this mssparkutils.notebook.runMultiple function to run notebook in parallel, how to find the failed notebook.

Accepted answer

Smaran Thoomu 19,880 Reputation points Microsoft Vendor

2025-01-29T15:05:31.6566667+00:00
Hi @SaiSekhar, MahasivaRavi (Philadelphia)
Welcome to Microsoft Q&A platform and thanks for posting your query here.

The method mssparkutils.notebook.runMultiple() lets you run multiple notebooks in parallel or with a predefined topological structure. The API uses a multi-threaded implementation within a single Spark session, which means the compute resources are shared by the referenced notebook runs. I ran mssparkutils.notebook.help("runMultiple") on our end and it executed without any issues.

Here is the status view of notebook run: Notebook1

Here is the status view of notebook run: Notebook2

In the above example, both notebooks (Notebook1 and Notebook2) ran in the same Apache Spark application, Livy ID 12.

To find the failed notebooks when using the mssparkutils.notebook.runMultiple function, you can inspect the associated snapshots for details on the failures. The error message you received indicates that multiple runs failed, and it suggests checking the snapshots for more information.

Additionally, you can implement a try-except block around your runMultiple call. In the except block, you can retrieve the results by calling .result on the exception. This will help you gather more information about which specific notebooks encountered issues.

Here's an example of how you might structure your code:

try:
    mssparkutils.notebook.runMultiple(DAG)
except Exception as e:
    print(e.result)  # This will give you details about the failed notebooks

Make sure to handle the exception properly to avoid the AttributeError you encountered, which suggests that you're trying to access a property on a dictionary that doesn't exist.
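The AttributeError comes from `DAG.name`: `DAG` is a plain Python dict, so its contents are read with keys rather than attributes. A minimal sketch of listing the activity names from a dictionary like the one in the question (the helper name `activity_names` is hypothetical, not part of mssparkutils):

```python
# DAG is an ordinary dict, so DAG.name raises AttributeError.
# The notebook names live under DAG["activities"][i]["name"].
DAG = {
    "activities": [
        {"name": "Notebook1", "path": "notebook1", "timeoutPerCellInSeconds": 120},
        {"name": "notebbok2", "path": "notebbok2", "timeoutPerCellInSeconds": 120},
    ]
}


def activity_names(dag):
    """Return the name of every activity declared in the DAG dict."""
    return [activity["name"] for activity in dag["activities"]]


print(activity_names(DAG))
```

This only lists what was submitted; which of those names failed still has to come from the exception raised by runMultiple.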

For more details, you can refer to Microsoft's official documentation on running parallel notebooks: Azure Databricks Parallel Notebooks Guide

Hope this helps. Do let us know if you have any further queries.

If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".
SaiSekhar, MahasivaRavi (Philadelphia) 100 Reputation points

2025-01-29T15:20:20.97+00:00

Hi @Smaran Thoomu ,
Thanks for the update,

What I'm asking is: if n2 fails, we need to know that without opening the notebook. Can you please provide a script for that?
For example, we have to process all the notebooks, but I need to display that notebook n2 failed, and execution should proceed to the next cells as well.
Of the notebooks below, if 2 fail, we have to print the failed notebooks.

DAG = {
    "activities": [
        # {"name": "n1", "path": "n1", "timeoutPerCellInSeconds": 120},
        # {"name": "D_TO_X_SASTO_CONTRAT", "path": "D_TO_X_SASTO_CONTRAT", "timeoutPerCellInSeconds": 120},
        # {"name": "n2", "path": "n2", "timeoutPerCellInSeconds": 120},
        # {"name": "n3", "path": "n3", "timeoutPerCellInSeconds": 120},
        # {"name": "n4", "path": "n4", "timeoutPerCellInSeconds": 120},
    ]
}

# try:
#     mssparkutils.notebook.runMultiple(DAG)
# except Exception as e:
#     print(e.result)

SaiSekhar, MahasivaRavi (Philadelphia) 100 Reputation points

2025-01-29T15:44:31.1+00:00

Hi @Smaran Thoomu ,
Thanks for the update,

Can you please provide code for the scenario below?
For example, we run 4 notebooks in parallel and 2 of them fail. We have to identify them without opening the executed notebooks and print the 2 failed notebooks from the code. Can you please help us with this?

We are using the code below but are not able to get the correct output.
The script itself should identify the failed notebooks, and it should still execute the other cells.

DAG = {
    "activities": [
        {"name": "n1", "path": "n1", "timeoutPerCellInSeconds": 120},
        {"name": "n2", "path": "n2", "timeoutPerCellInSeconds": 120},
        {"name": "n3", "path": "n3", "timeoutPerCellInSeconds": 120},
        {"name": "n4", "path": "n4", "timeoutPerCellInSeconds": 120}
    ]
}

try:
    mssparkutils.notebook.runMultiple(DAG)
except Exception as e:
    print(e.result)

Smaran Thoomu 19,880 Reputation points Microsoft Vendor

2025-01-29T18:49:45.4833333+00:00

Hi SaiSekhar, MahasivaRavi (Philadelphia)

In the scenario where you are running 4 notebooks in parallel and need to identify the failed notebooks without opening them and print their names, please try the following code and let me know if it works.

from notebookutils import mssparkutils

DAG = {
    "activities": [
        {"name": "n1", "path": "n1", "timeoutPerCellInSeconds": 120},
        {"name": "n2", "path": "n2", "timeoutPerCellInSeconds": 120},
        {"name": "n3", "path": "n3", "timeoutPerCellInSeconds": 120},
        {"name": "n4", "path": "n4", "timeoutPerCellInSeconds": 120},
    ]
}

failed_notebooks = []

try:
    # Attempt to run multiple notebooks in parallel
    mssparkutils.notebook.runMultiple(DAG)
except Exception as e:
    # Capture failed notebook names
    print(f"RunMultipleFailedException: {str(e)}")
    failed_notebook_details = getattr(e, "result", {})
    if failed_notebook_details:
        for activity in DAG["activities"]:
            if activity["name"] in failed_notebook_details:
                failed_notebooks.append(activity["name"])

# Print the failed notebooks
if failed_notebooks:
    print("\nThe following notebooks failed:")
    for failed_notebook in failed_notebooks:
        print(f"- {failed_notebook}")
else:
    print("\nAll notebooks executed successfully.")

How This Works:

Parallel Execution: All 4 notebooks (n1, n2, n3, n4) will execute simultaneously.

Error Logging: If any notebook fails, it is captured and printed (e.g., n2 and n3).

Sample Output (If n2 and n3 Fail):

RunMultipleFailedException: Multiple runs failed: 2 notebook(s) encountered issues.

The following notebooks failed:
- n2
- n3

This approach allows you to quickly identify which notebooks failed without manually opening them, and the other notebooks will continue executing normally.
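The shape of `e.result` is not shown in this thread, and a simple membership check on the activity name would flag every notebook if the result also contains entries for successful runs. As a sketch, assuming `e.result` is a dict keyed by activity name whose values are dicts with an `exception` entry that is empty or None for successful runs (an assumption worth checking by printing `e.result` once in your environment), the failed names could be filtered like this. `failed_from_result` is a hypothetical helper and the sample data is made up for illustration:

```python
def failed_from_result(result):
    """Return the names whose entry carries a non-empty 'exception' value.

    Assumes `result` maps activity name -> dict of run details. Entries that
    are not dicts are treated as failures so nothing is silently dropped.
    """
    failed = []
    for name, details in (result or {}).items():
        if not isinstance(details, dict) or details.get("exception"):
            failed.append(name)
    return failed


# Made-up data in the assumed shape, standing in for e.result.
sample_result = {
    "n1": {"exitVal": "ok", "exception": None},
    "n2": {"exitVal": None, "exception": "Cell timed out"},
    "n3": {"exitVal": None, "exception": "ValueError: bad input"},
    "n4": {"exitVal": "ok", "exception": None},
}

for name in failed_from_result(sample_result):
    print(f"- {name}")
```

Inside the except block this could be called as `failed_from_result(getattr(e, "result", None))`. Because the exception is caught, the cells after the runMultiple call still execute.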

I hope this helps! Let me know if you have any further questions.

SaiSekhar, MahasivaRavi (Philadelphia) 100 Reputation points

2025-01-30T13:52:19.5233333+00:00

Hi @Smaran Thoomu ,
Thanks for the update. We have a few questions; can you please check and get back to us?
Does timeoutPerCellInSeconds mean a notebook timeout or a notebook cell timeout?
Can you please tell us how to add a notebook timeout parameter?
Is there any way we can pass a variable instead of 120?
timeoutvalue = '120'
"timeoutPerCellInSeconds": timeoutvalue
Please share code showing how to handle this. Thank you!

DAG = {
    "activities": [
        {"name": "n1", "path": "n1", "timeoutPerCellInSeconds": 120},
        {"name": "n2", "path": "n2", "timeoutPerCellInSeconds": 120},
        {"name": "n3", "path": "n3", "timeoutPerCellInSeconds": 120},
        {"name": "n4", "path": "n4", "timeoutPerCellInSeconds": 120},
    ]
}

Smaran Thoomu 19,880 Reputation points Microsoft Vendor

2025-01-30T17:46:53.7633333+00:00

Hi @SaiSekhar, MahasivaRavi (Philadelphia)

Thanks for your questions! Happy to clarify them for you.

timeoutPerCellInSeconds - Notebook or Cell Timeout?

This parameter controls the timeout for each cell in the notebook, not the entire notebook execution. If a specific cell takes longer than this time, it will be forcefully stopped.

How to Set a Timeout for the Entire Notebook?

If you want to set a timeout for the entire notebook instead of individual cells, you can pass a timeout in seconds when triggering the notebook via mssparkutils.notebook.run(). Example:

mssparkutils.notebook.run("NotebookPath", 300)  # 300-second timeout for the full notebook execution

Can we pass a variable instead of a hardcoded timeout?

Yes! You can absolutely store the timeout value in a variable and reference it dynamically in your DAG definition. Here’s how you can modify your code:

timeout_value = 120  # Define timeout as a variable

DAG = {
    "activities": [
        {"name": "n1", "path": "n1", "timeoutPerCellInSeconds": timeout_value},
        {"name": "n2", "path": "n2", "timeoutPerCellInSeconds": timeout_value},
        {"name": "n3", "path": "n3", "timeoutPerCellInSeconds": timeout_value},
        {"name": "n4", "path": "n4", "timeoutPerCellInSeconds": timeout_value},
    ]
}

Just make sure the variable is an integer and not a string (for example 120, not '120'), since the parameter expects a number.
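If the timeout arrives as a string (as in `timeoutvalue = '120'` from the question), it can be converted once before the DAG is built. A small sketch; `build_dag` is a hypothetical helper, not part of mssparkutils, and it assumes each notebook's path equals its name as in the examples in this thread:

```python
def build_dag(notebook_names, timeout_per_cell):
    """Build a runMultiple-style DAG dict with one activity per notebook."""
    # int() accepts 120 or "120" and raises ValueError on non-numeric input.
    timeout = int(timeout_per_cell)
    return {
        "activities": [
            {"name": name, "path": name, "timeoutPerCellInSeconds": timeout}
            for name in notebook_names
        ]
    }


timeoutvalue = "120"  # string, as in the question
DAG = build_dag(["n1", "n2", "n3", "n4"], timeoutvalue)
print(DAG["activities"][0])
```

Building the list in one place also keeps the timeout consistent across all activities when the value changes.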

I hope this helps! Let me know if you have any further questions.

SaiSekhar, MahasivaRavi (Philadelphia) 100 Reputation points

2025-01-31T09:51:01.53+00:00

Hi @Smaran Thoomu

Is it possible to add a whole-notebook timeout parameter to the DAG dict? Can you please confirm whether a timeout in seconds for the whole notebook can be passed in the dict below, and how to pass it in parallel execution?

DAG = {
    "activities": [
        {"name": "n1", "path": "n1", "timeoutPerCellInSeconds": timeout_value},
        {"name": "n2", "path": "n2", "timeoutPerCellInSeconds": timeout_value},
        {"name": "n3", "path": "n3", "timeoutPerCellInSeconds": timeout_value},
        {"name": "n4", "path": "n4", "timeoutPerCellInSeconds": timeout_value},
    ]
}

Smaran Thoomu 19,880 Reputation points Microsoft Vendor

2025-02-03T06:04:31.8433333+00:00

Hi @SaiSekhar, MahasivaRavi (Philadelphia)
Thank you for your follow-up question! Let me clarify your query regarding adding a whole notebook timeout parameter in the DAG dictionary for parallel execution.

Is it possible to add a whole notebook timeout parameter in DAG?

Unfortunately, the mssparkutils.notebook.runMultiple function does not support a whole notebook timeout parameter directly within the DAG dictionary. The timeoutPerCellInSeconds parameter is specifically designed to set a timeout for individual cells within each notebook, not for the entire notebook execution.

Workaround: Set Timeout When Calling Notebooks

Instead of using timeoutPerCellInSeconds, you can try setting a timeout_seconds value on each activity in the DAG passed to runMultiple(), mirroring the timeout that mssparkutils.notebook.run() takes for a single notebook.

from notebookutils import mssparkutils

DAG = {
    "activities": [
        {"name": "n1", "path": "n1", "timeout_seconds": 300},  # Set 5 min timeout per notebook
        {"name": "n2", "path": "n2", "timeout_seconds": 300},
        {"name": "n3", "path": "n3", "timeout_seconds": 300},
        {"name": "n4", "path": "n4", "timeout_seconds": 300},
    ]
}

try:
    mssparkutils.notebook.runMultiple(DAG)
except Exception as e:
    print(f"Error occurred: {e}")

This is intended to apply a timeout at the notebook level rather than per cell. I hope this helps! Let me know if you have any further questions.
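The DAG above relies on a `timeout_seconds` key being honored per activity, which is worth verifying in your runtime. An alternative sketch of the workaround described in the text is to call a single-notebook runner once per notebook from a thread pool. Here `run_notebook` is an injected callable so the pattern can be tried outside a Spark session; in a notebook it might be something like `lambda path, t: mssparkutils.notebook.run(path, t)`, assuming the timeout is accepted as the second positional argument. `run_in_parallel` and `fake_runner` are hypothetical names used only for illustration:

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(run_notebook, paths, timeout_seconds, max_workers=4):
    """Run every path through run_notebook(path, timeout_seconds) concurrently.

    Returns (succeeded, failed): succeeded maps path -> return value,
    failed maps path -> the exception that was raised.
    """
    succeeded, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {p: pool.submit(run_notebook, p, timeout_seconds) for p in paths}
        for path, future in futures.items():
            try:
                succeeded[path] = future.result()
            except Exception as exc:  # keep going so every failure is reported
                failed[path] = exc
    return succeeded, failed


# Stand-in runner used only for illustration; it fails for n2 and n3.
def fake_runner(path, timeout_seconds):
    if path in ("n2", "n3"):
        raise RuntimeError(f"{path} failed")
    return f"{path} done"


ok, bad = run_in_parallel(fake_runner, ["n1", "n2", "n3", "n4"], 300)
print("Failed notebooks:", sorted(bad))
```

In this sketch the timeout is enforced by whatever `run_notebook` does with its second argument; the thread pool itself only collects results and exceptions, so one failing notebook does not stop the others.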