Cannot index my data (Data ingestion failed)

Talha 0 Reputation points
2024-12-25T13:01:48.31+00:00

When I try to index my data using AI Search, Step 1 (cracking and chunking) completes successfully, but Step 2 (Creating Azure AI Search Index) fails with the error "data ingestion failed".
I am the owner and have Owner permission everywhere, so how can this be a permission issue? How can I fix this? I get the following logs:

embeddings=azureml://subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourcegroups/CTRL-AI/workspaces/ctrl-ai_app/datastores/workspaceblobstore/paths/azureml/e787ad6c-ebe3-4c04-98da-4d4f410256bd/embeddings/
acs_config={"index_name":"funny-yam-6mr20yhcsn"}
connection_id=None
output=/mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/cap/data-capability/wd/index
verbosity=1
[2024-12-25 12:08:21] INFO     azureml.rag.update_acs - Reading connection id from environment variable (update_acs.py:495)
[2024-12-25 12:08:22] INFO     azureml.rag.update_acs.update_acs - ActivityStarted, update_acs (activity.py:108)
[2024-12-25 12:08:22] INFO     azureml.rag.connections - Getting workspace connection: ctrlsearchservice, with input credential: <class 'NoneType'>. (connections.py:332)
[2024-12-25 12:08:22] INFO     azureml.rag.connections - Getting credential from AzureMLTokenAuthentication._initialize_aml_token_auth (connections.py:341)
[2024-12-25 12:08:22] INFO     azureml.rag.connections - Getting workspace connection via MLClient with auth: <class 'azureml.dataprep.api._aml_auth._azureml_token_authentication.AzureMLTokenAuthentication'>, subscription_id: ea612f4e-d54f-49b7-8303-01676b4a8d39, resource_group_name: CTRL-AI, workspace_name: ctrl-ai_app. (connections.py:348)
Method connections: This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
[2024-12-25 12:08:22] INFO     azureml.rag.connections - Using ml_client base_url: https://australiaeast.api.azureml.ms/rp/workspaces, original_base_url: https://management.azure.com. (connections.py:367)
[2024-12-25 12:08:23] INFO     azureml.rag.connections - Parsed Connection: /subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourceGroups/CTRL-AI/providers/Microsoft.MachineLearningServices/workspaces/ctrl-ai_app/connections/ctrlsearchservice (connections.py:386)
[2024-12-25 12:08:23] INFO     azureml.rag.connections - Got connection: /subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourceGroups/CTRL-AI/providers/Microsoft.MachineLearningServices/workspaces/ctrl-ai_app/connections/ctrlsearchservice as <class 'azure.ai.ml.entities._workspace.connections.connection_subtypes.AzureAISearchConnection'>. (connections.py:443)
[2024-12-25 12:08:23] INFO     azureml.rag.update_acs - got embeddings uri as input: azureml://subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourcegroups/CTRL-AI/workspaces/ctrl-ai_app/datastores/workspaceblobstore/paths/azureml/e787ad6c-ebe3-4c04-98da-4d4f410256bd/embeddings/ (update_acs.py:407)
[2024-12-25 12:08:23] INFO     azureml.rag.update_acs - extracted embeddings directory name: embeddings (update_acs.py:410)
[2024-12-25 12:08:23] INFO     azureml.rag.update_acs - extracted embeddings container path: azureml://subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourcegroups/CTRL-AI/workspaces/ctrl-ai_app/datastores/workspaceblobstore/paths/azureml/e787ad6c-ebe3-4c04-98da-4d4f410256bd/ (update_acs.py:412)
[2024-12-25 12:08:23] INFO     azureml.rag.update_acs - mounting embeddings container from: 
azureml://subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourcegroups/CTRL-AI/workspaces/ctrl-ai_app/datastores/workspaceblobstore/paths/azureml/e787ad6c-ebe3-4c04-98da-4d4f410256bd/ 
   to: 
/mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount (update_acs.py:417)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - loading embeddings from: /mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount/embeddings (__init__.py:573)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Getting workspace connection: ctrlai3678883812_aoai, with input credential: <class 'NoneType'>. (connections.py:332)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Getting credential from AzureMLTokenAuthentication._initialize_aml_token_auth (connections.py:341)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Getting workspace connection via MLClient with auth: <class 'azureml.dataprep.api._aml_auth._azureml_token_authentication.AzureMLTokenAuthentication'>, subscription_id: ea612f4e-d54f-49b7-8303-01676b4a8d39, resource_group_name: CTRL-AI, workspace_name: ctrl-ai_app. (connections.py:348)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Using ml_client base_url: https://australiaeast.api.azureml.ms/rp/workspaces, original_base_url: https://management.azure.com. (connections.py:367)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Parsed Connection: /subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourceGroups/CTRL-AI/providers/Microsoft.MachineLearningServices/workspaces/ctrl-ai_app/connections/ctrlai3678883812_aoai (connections.py:386)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Got connection: /subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourceGroups/CTRL-AI/providers/Microsoft.MachineLearningServices/workspaces/ctrl-ai_app/connections/ctrlai3678883812_aoai as <class 'azure.ai.ml.entities._workspace.connections.connection_subtypes.AzureOpenAIConnection'>. (connections.py:443)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - The connection 'ctrlai3678883812_aoai' is a <class 'azure.ai.ml.entities._workspace.connections.connection_subtypes.AzureOpenAIConnection'> with api_key auth type. (connections.py:184)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - loading embeddings from file format version: 2 (__init__.py:590)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - found following sources partitions: [PosixPath('/mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount/embeddings/sources.parquet')] (__init__.py:651)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - processing partition: /mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount/embeddings/sources.parquet (__init__.py:653)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - found following deleted sources partitions: [] (__init__.py:673)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - found following deleted document partitions: [] (__init__.py:695)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - found following embedding partitions: [PosixPath('/mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount/embeddings/embeddings.parquet'), PosixPath('/mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount/embeddings/embeddings_0.parquet')] (__init__.py:721)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - processing partition: /mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount/embeddings/embeddings.parquet (__init__.py:723)
[2024-12-25 12:08:24] INFO     azureml.rag.azureml.rag.embeddings - processing partition: /mnt/azureml/cr/j/9b55848337284fe58ab16f3db95cf15a/exe/wd/embeddings_mount/embeddings/embeddings_0.parquet (__init__.py:723)
[2024-12-25 12:08:24] INFO     azureml.rag.update_acs.update_acs - ActivityStarted, update_acs (activity.py:108)
[2024-12-25 12:08:24] INFO     azureml.rag.update_acs - Updating ACS index (update_acs.py:294)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Getting workspace connection: ctrlsearchservice, with input credential: <class 'NoneType'>. (connections.py:332)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Getting credential from AzureMLTokenAuthentication._initialize_aml_token_auth (connections.py:341)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Getting workspace connection via MLClient with auth: <class 'azureml.dataprep.api._aml_auth._azureml_token_authentication.AzureMLTokenAuthentication'>, subscription_id: ea612f4e-d54f-49b7-8303-01676b4a8d39, resource_group_name: CTRL-AI, workspace_name: ctrl-ai_app. (connections.py:348)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Using ml_client base_url: https://australiaeast.api.azureml.ms/rp/workspaces, original_base_url: https://management.azure.com. (connections.py:367)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Parsed Connection: /subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourceGroups/CTRL-AI/providers/Microsoft.MachineLearningServices/workspaces/ctrl-ai_app/connections/ctrlsearchservice (connections.py:386)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - Got connection: /subscriptions/ea612f4e-d54f-49b7-8303-01676b4a8d39/resourceGroups/CTRL-AI/providers/Microsoft.MachineLearningServices/workspaces/ctrl-ai_app/connections/ctrlsearchservice as <class 'azure.ai.ml.entities._workspace.connections.connection_subtypes.AzureAISearchConnection'>. (connections.py:443)
[2024-12-25 12:08:24] INFO     azureml.rag.connections - The connection 'ctrlsearchservice' is a <class 'azure.ai.ml.entities._workspace.connections.connection_subtypes.AzureAISearchConnection'> with api_key auth type. (connections.py:184)
[2024-12-25 12:08:24] INFO     azureml.rag.update_acs - Using Index fields: {
  "content": "content",
  "url": "url",
  "filename": "filepath",
  "title": "title",
  "metadata": "meta_json_string",
  "embedding": "contentVector"
} (update_acs.py:309)
[2024-12-25 12:08:24] INFO     azureml.rag.update_acs - Ensuring search index funny-yam-6mr20yhcsn exists (update_acs.py:127)
[2024-12-25 12:08:24] ERROR    azureml.rag.update_acs.update_acs - ActivityCompleted: Activity=update_acs, HowEnded=Failure, Duration=156.07 [ms], Exception=HttpResponseError (activity.py:127)
[2024-12-25 12:08:25] ERROR    azureml.rag.update_acs - Failed to update ACS index (update_acs.py:429)
[2024-12-25 12:08:25] ERROR    azureml.rag.update_acs.update_acs - ServiceError: intepreted error = Rag system error, original error = Operation returned an invalid status 'Forbidden' (exceptions.py:124)
[2024-12-25 12:08:30] ERROR    azureml.rag.update_acs.update_acs - update_acs failed with exception: Traceback (most recent call last):
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 465, in main_wrapper
    map_exceptions(main, activity_logger, args, logger, activity_logger)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/utils/exceptions.py", line 126, in map_exceptions
    raise e
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/utils/exceptions.py", line 118, in map_exceptions
    return func(*func_args, **kwargs)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 456, in main
    raise e
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 420, in main
    create_index_from_raw_embeddings(
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 311, in create_index_from_raw_embeddings
    create_search_index_sdk(acs_config, connection_credential, emb)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 131, in create_search_index_sdk
    if acs_config["index_name"] not in index_client.list_index_names():
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azure/core/paging.py", line 123, in __next__
    return next(self._page_iterator)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azure/core/paging.py", line 75, in __next__
    self._response = self._get_next(self.continuation_token)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azure/search/documents/indexes/_generated/operations/_indexes_operations.py", line 502, in get_next
    raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: Operation returned an invalid status 'Forbidden'
 (update_acs.py:467)
[2024-12-25 12:08:31] ERROR    azureml.rag.update_acs.update_acs - ActivityCompleted: Activity=update_acs, HowEnded=Failure, Duration=8765.49 [ms], Exception=HttpResponseError (activity.py:127)
Traceback (most recent call last):
  File "/azureml-envs/rag-embeddings/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/azureml-envs/rag-embeddings/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 499, in <module>
    main_wrapper(args, logger)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 465, in main_wrapper
    map_exceptions(main, activity_logger, args, logger, activity_logger)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/utils/exceptions.py", line 126, in map_exceptions
    raise e
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/utils/exceptions.py", line 118, in map_exceptions
    return func(*func_args, **kwargs)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 456, in main
    raise e
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 420, in main
    create_index_from_raw_embeddings(
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 311, in create_index_from_raw_embeddings
    create_search_index_sdk(acs_config, connection_credential, emb)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azureml/rag/tasks/update_acs.py", line 131, in create_search_index_sdk
    if acs_config["index_name"] not in index_client.list_index_names():
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azure/core/paging.py", line 123, in __next__
    return next(self._page_iterator)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azure/core/paging.py", line 75, in __next__
    self._response = self._get_next(self.continuation_token)
  File "/azureml-envs/rag-embeddings/lib/python3.9/site-packages/azure/search/documents/indexes/_generated/operations/_indexes_operations.py", line 502, in get_next
    raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: Operation returned an invalid status 'Forbidden'
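
The traceback shows the job failing at `index_client.list_index_names()`, and the log reports that the `ctrlsearchservice` connection uses `api_key` auth. One way to narrow this down is to send the equivalent "list indexes" request directly to the search service with the same key, outside the pipeline. The sketch below is only an assumption about how to reproduce the call, not what the pipeline does internally: it uses the Azure AI Search REST endpoint with the standard library, and the environment variable names `SEARCH_ENDPOINT` and `SEARCH_API_KEY`, the helper function names, and the chosen `api-version` are placeholders introduced here for illustration.

```python
import os
import urllib.error
import urllib.request

# Assumption: a stable Azure AI Search REST API version; adjust if needed.
API_VERSION = "2023-11-01"


def build_list_indexes_request(endpoint: str, api_key: str) -> urllib.request.Request:
    """Build a GET /indexes request authenticated with an API key header."""
    url = f"{endpoint.rstrip('/')}/indexes?api-version={API_VERSION}&$select=name"
    return urllib.request.Request(url, headers={"api-key": api_key}, method="GET")


def check_list_indexes(endpoint: str, api_key: str) -> int:
    """Send the request and return the HTTP status code (for example 200 or 403)."""
    request = build_list_indexes_request(endpoint, api_key)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # A 403 here would match the 'Forbidden' status in the job log.
        return err.code


if __name__ == "__main__":
    # Hypothetical variable names: set them to your own service URL
    # (https://<service-name>.search.windows.net) and the key stored in the connection.
    endpoint = os.environ.get("SEARCH_ENDPOINT")
    api_key = os.environ.get("SEARCH_API_KEY")
    if endpoint and api_key:
        print("HTTP status:", check_list_indexes(endpoint, api_key))
    else:
        print("Set SEARCH_ENDPOINT and SEARCH_API_KEY to run the check.")
```

If this request also returns 403, the key stored in the connection or the search service's access settings are worth checking. With key-based auth, the request is authorized by the key itself, so holding the Owner role on the subscription or resource group does not by itself decide whether this call succeeds.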
Azure AI Search