AI Search indexing has started failing on all deployments with a DNS error

Craig Pay 20 Reputation points
2025-01-15T10:39:28.5133333+00:00

I have 4 or 5 Azure AI deployments with private endpoints, no public access, all stable, all deployed with Terraform, all are tested the same way with a single PDF which usually indexes fine. I've built dozens of these things over the last few months, with multiple teardowns and redeploys.

Yesterday I deployed a new one, the indexing didn't work with a DNS error. I went back to check my other deployments from days or weeks ago, and they're all now failing indexing with the same error.

Is there an outage? I'm using Microsoft Managed VNET connectivity (AI Hub managed workspace).

Log...

Status messages

View all logs__ErrorUserError

Error Code: ScriptExecution.WriteStreams.ServerUnavailable
Native Error: Connection to destination failed.
	ConnectionFailure { source: Some(hyper::Error(Connect, ConnectError("dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: Name or service not known" }))) }
=> error trying to connect: dns error: failed to lookup address information: Name or service not known
	hyper::Error(Connect, ConnectError("dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: Name or service not known" }))
=> dns error: failed to lookup address information: Name or service not known
	ConnectError("dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: Name or service not known" })
=> failed to lookup address information: Name or service not known
	Custom { kind: Uncategorized, error: "failed to lookup address information: Name or service not known" }
Error Message: Connection failed when trying to access the destination. error trying to connect: dns error: failed to lookup address information: Name or service not known| session_id=40f5760a-94fd-4d76-bcd0-899fd3aeef23

Warnings (1)

__Warning

AzureMLCompute job failed
data-capability.AssetUploadOutputSession.ExecutionError: [REDACTED]
	Reason: [REDACTED]
	StackTrace:   File "/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/data_capability/capability_session.py", line 123, in end
    session.end(commit)

  File "/opt/miniconda/envs/data-capability/lib/python3.9/site-packages/data_capability/data_sessions.py", line 1064, in end
    self._uri_upload_session.end(commit)

  File "/opt/miniconda/envs/data-capability/lib/pyth
Azure AI Search
Azure AI Search
An Azure search service with built-in artificial intelligence capabilities that enrich information to help identify and explore relevant content at scale.
1,166 questions
{count} votes

Accepted answer
  1. Laxman Reddy Revuri 1,995 Reputation points Microsoft Vendor
    2025-01-16T16:01:24.51+00:00

    Hi @Craig Pay
    I'm glad that you were able to resolve your issue and thank you for posting your solution so that others experiencing the same thing can easily reference this! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others ", I'll repost your solution in case you'd like to accept the answer
    Ask: AI Search indexing has started failing on all deployments with a DNS error
    Solution:
    OK, I think I know what's going on. Microsoft have changed how Workspace managed outbound access works for an AI Hub.

    Previously, the Hub would automatically create its own entries for Key Vault and Storage Account under 'Required outbound rules' alongside one for the machine learning resource. They have wacky names that start _SYS_PE[name of resource].

    At some point in the last 3 days, the Hub is behaving differently, needing these entries to be manually created for Key Vault and Storage Account.

    I swear I had this once before with Storage Account, and after a single round of manually creating them, well, using Terraform, I had to stop manually creating them again as they started reappearing again. I've no idea what Microsoft are doing with this weird managed Vnet of theirs. It's crazy!

    Anyway, the AI Hub in all of my deployments is no longer automatically creating entries in 'Required outbound rules' in the 'Workspace managed outbound access' for Key Vault or the Storage Account. They've literally vanished, leaving behind the private endpoint these entries subsequently create against the resource in question.

    I've manually added 'User-defined outbound rules' for Key Vault and Storage Account (blob and file), re-run an index and it worked.

    This isn't a DNS corruption issue or misconfiguration.

    The AI Hub is behaving differently. It must be a change applied by Microsoft behind the scenes.

    I suspect I'm only one of a very small number of people who are wiring up their Azure AI services this tightly, to use completely private networking: zero public access, effectively dual-homed between my own Vnet and the Microsoft managed Vnet. Hence the lack of other people talking about this issue out there.
    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members. 


0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.