Malware scanning in Defender for Storage

Malware Scanning in Defender for Storage helps protect your Azure Blob Storage from malicious content by performing a full malware scan on uploaded content in near real-time, using Microsoft Defender Antivirus capabilities. It's designed to help fulfill security and compliance requirements for handling untrusted content.

The Malware Scanning capability is an agentless software as a service (SaaS) solution that allows simple setup at scale, with zero maintenance, and supports automating response at scale.

Diagram showing how Malware Scanning detects malicious files upon upload in near real-time.

Malware upload is a top threat on cloud storage

Content uploaded to cloud storage could be malware. Storage accounts can be a malware entry point into the organization and a malware distribution point. To protect organizations from this threat, content in cloud storage must be scanned for malware before it's accessed.

Malware scanning in Defender for Storage helps protect storage accounts from malicious content

A built-in SaaS solution that allows simple enabling at scale with zero maintenance.
Comprehensive antimalware capabilities using Microsoft Defender Antivirus (MDAV), catching polymorphic and metamorphic malware.
Every file type is scanned (including archives like zip files) and a result is returned for every scan. The file size limit is 2 GB.
Supports response at scale – deleting or quarantining suspicious files, based on the blobs’ index tags or Event Grid events.
When the malware scan identifies a malicious file, detailed Microsoft Defender for Cloud security alerts are generated.
Designed to help fulfill security and compliance requirements to scan untrusted content uploaded to storage, including an option to log every scan result.

Common use-cases and scenarios

Some common use-cases and scenarios for malware scanning in Defender for Storage include:

Web applications: many cloud web applications allow users to upload content to storage. This allows low maintenance and scalable storage for applications like tax apps, CV upload HR sites, and receipts upload.
Content protection: assets like videos and photos are commonly shared and distributed at scale, both internally and to external parties. Content delivery networks (CDNs) and content hubs are a classic malware distribution opportunity.
Compliance requirements: resources that adhere to compliance standards like National Institute of Standards and Technology, Society for Worldwide Interbank Financial Telecommunications, General Data Protection Regulation, and others require robust security practices, which include malware scanning. It's critical for organizations operating in regulated industries or regions.
Third-party integration: third-party data can come from a wide variety of sources, and not all of them might have robust security practices, such as business partners, developers, and contractors. Scanning for malware helps to ensure that this data doesn't introduce security risks to your system.
Collaborative platforms: similar to file sharing, teams use cloud storage for continuously sharing content and collaborating across teams and organizations. Scanning for malware ensures safe collaboration.
Data pipelines: data moving through ETL (Extract, Transform, Load) processes can come from multiple sources and might include malware. Scanning for malware can help to ensure the integrity of these pipelines.
Machine learning training data: the quality and security of the training data are critical for effective machine learning models. It's important to ensure these data sets are clean and safe, especially if they include user-generated content or data from external sources.

Screenshot showing a common use case and scenario for malware scanning in Defender for Storage.

Malware scanning is a near real-time service. Scan times can vary depending on the size and type of the scanned file, as well as on the load on the service or on the storage account. Microsoft is constantly working on reducing the overall scan time. However, you should take this variability in scan times into consideration when designing a user experience based on the service.

Prerequisites

To enable and configure Malware Scanning, you must have Owner roles (such as Subscription Owner or Storage Account Owner) or specific roles with the necessary data actions. Learn more about the required permissions.

You can enable and configure Malware Scanning at scale for your subscriptions while maintaining granular control over configuring the feature for individual storage accounts. There are several ways to enable and configure Malware Scanning: Azure built-in policy (the recommended method), programmatically using Infrastructure as Code templates, including Terraform, Bicep, and Azure Resource Manager (ARM) templates, using the Azure portal, or directly with the REST API.
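
As a rough illustration of the programmatic route, the sketch below builds the kind of settings body that an Infrastructure as Code template or a REST call might carry. The helper name `build_malware_scanning_settings` is hypothetical, and the property names (`isEnabled`, `overrideSubscriptionLevelSettings`, `malwareScanning.onUpload.capGBPerMonth`) are assumptions made for illustration. Check them against the current Defender for Storage settings reference before relying on them.

```python
import json


def build_malware_scanning_settings(enable_scanning=True, cap_gb_per_month=5000,
                                    override_subscription=False):
    """Return a Defender for Storage settings body as a plain dict.

    Hypothetical helper: the property names below are assumptions,
    not a confirmed schema.
    """
    return {
        "properties": {
            # Turns on the Defender for Storage plan for the account.
            "isEnabled": True,
            # Lets an account-level setting differ from the subscription's.
            "overrideSubscriptionLevelSettings": override_subscription,
            "malwareScanning": {
                "onUpload": {
                    "isEnabled": enable_scanning,
                    # Monthly scanning cap in GB for this storage account.
                    "capGBPerMonth": cap_gb_per_month,
                }
            },
        }
    }


if __name__ == "__main__":
    body = build_malware_scanning_settings(cap_gb_per_month=1000)
    print(json.dumps(body, indent=2))
```

Keeping the body in one small function makes it easier to apply the same settings to many storage accounts while still overriding the cap for individual accounts.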

How does malware scanning work

On-upload malware scanning

On-upload triggers

When a blob is uploaded to a protected storage account, a malware scan is triggered. All upload methods trigger the scan. Modifying a blob is an upload operation, so the modified content is scanned after the update.

Scan regions and data retention

The malware scanning service, which uses Microsoft Defender Antivirus technologies, reads the blob. Malware Scanning scans the content in memory and deletes its copy of the scanned content immediately after scanning. The content isn't retained. The scanning occurs within the same region as the storage account. In some cases, when a file is suspicious and more data is required, Malware Scanning might share file metadata outside the scanning region with Microsoft Defender for Endpoint, including metadata classified as customer data (for example, the SHA-256 hash).

Access customer data

The Malware Scanning service requires access to your data to scan it for malware. During service enablement, a new Data Scanner resource called StorageDataScanner is created in your Azure subscription. This resource is granted a Storage Blob Data Owner role assignment so that it can access and change your data for malware scanning and sensitive data discovery.

Private Endpoint is supported out-of-the-box

Malware scanning in Defender for Storage is supported in storage accounts that use private endpoints while maintaining data privacy.

Private endpoints provide secure connectivity to your Azure storage services, eliminating public internet exposure, and are considered a best practice.

Setup of malware scanning

When malware scanning is enabled, the following actions automatically take place in your environment:

For each storage account you enable malware scanning on, an Event Grid System Topic resource is created in the same resource group as the storage account. The malware scanning service uses this resource to listen for blob upload triggers. Removing this resource breaks the malware scanning functionality.
To scan your data, the Malware Scanning service requires access to your data. During service enablement, a new Data Scanner resource called StorageDataScanner is created in your Azure subscription and assigned a system-assigned managed identity. This resource is granted the Storage Blob Data Owner role assignment, which permits it to access your data for the purposes of Malware Scanning and Sensitive Data Discovery.

If your storage account Networking configuration is set to Enable Public network access from selected virtual networks and IP addresses, the StorageDataScanner resource is added to the Resource instances section under the storage account Networking configuration to allow access to scan your data.

If you're enabling malware scanning at the subscription level, a new Security Operator resource called StorageAccounts/securityOperators/DefenderForStorageSecurityOperator is created in your Azure subscription and assigned a system-assigned managed identity. This resource is used to enable and repair the Defender for Storage and Malware Scanning configuration on existing storage accounts, and to check for new storage accounts created in the subscription that need to be enabled. This resource has role assignments that include the specific permissions needed to enable malware scanning.

Malware scanning depends on certain resources, identities, and networking settings to function properly. If you modify or delete any of these, malware scanning will stop working. To restore its normal operation, you can turn it off and on again.

Providing scan results

Malware scanning scan results are available through four methods. After setup, you'll see scan results as blob index tags for every uploaded and scanned file in the storage account, and as Microsoft Defender for Cloud security alerts when a file is identified as malicious.

You might choose to configure extra scan result methods, such as Event Grid and Log Analytics; these methods require extra configuration. In the next section, you'll learn about the different scan result methods.

Diagram showing an example of how to view and consume scan results.

Scan results

Blob index tags

Blob index tags are metadata fields on a blob. They categorize data in your storage account using key-value tag attributes. These tags are automatically indexed and exposed as a searchable multi-dimensional index to easily find data. The scan results in the tags are concise: they show the Malware Scanning scan result and the scan time in Coordinated Universal Time (UTC). Other result types (alerts, events, logs) provide more information on the malware type and the file upload operation.

Screenshot showing Malware Scanning results and malware scanning Coordinated Universal Time in the blob metadata.

Blob index tags can be used by applications to automate workflows, but aren't tamper-resistant.
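
To illustrate how an application might automate a workflow from the index tags, here is a minimal sketch that maps a blob's tags to an action. The tag key `Malware Scanning scan result` and the values `No threats found` and `Malicious` are assumptions; confirm them against a scanned blob in your own storage account. The function `decide_action` and its return values are hypothetical names.

```python
# Assumed tag key and values; confirm them on a scanned blob in your account.
SCAN_RESULT_TAG = "Malware Scanning scan result"
CLEAN_VALUE = "No threats found"
MALICIOUS_VALUE = "Malicious"


def decide_action(index_tags):
    """Map a blob's index tags (a dict) to 'release', 'quarantine', or 'hold'."""
    result = index_tags.get(SCAN_RESULT_TAG)
    if result == CLEAN_VALUE:
        return "release"
    if result == MALICIOUS_VALUE:
        return "quarantine"
    # Missing or unrecognized result: the blob may not have been scanned yet,
    # so keep it out of reach of downstream consumers.
    return "hold"


if __name__ == "__main__":
    print(decide_action({SCAN_RESULT_TAG: CLEAN_VALUE}))
    print(decide_action({}))
```

Treating a missing tag as "hold" rather than "release" is a deliberate choice here, since scan times vary and a blob can be read before its result arrives. Because the tags aren't tamper-resistant, a check like this is a convenience signal rather than a security boundary.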

Defender for Cloud security alerts

When a malicious file is detected, Microsoft Defender for Cloud generates a security alert. The alert contains details and context on the file, the malware type, and recommended investigation and remediation steps. To use these alerts for remediation, you can:

View security alerts in the Azure portal by navigating to Microsoft Defender for Cloud > Security alerts.
Configure automations based on these alerts.
Export security alerts to a security information and event management (SIEM) system. You can continuously export security alerts to Microsoft Sentinel (Microsoft’s SIEM) by using the Microsoft Sentinel connector, or to another SIEM of your choice.

Event Grid event

Event Grid is useful for event-driven automation. It's the fastest method to get results, delivering them with minimum latency in the form of events that you can use to automate your response.

Events from Event Grid custom topics can be consumed by multiple endpoint types. The most useful for malware scanning scenarios are:

Function App (previously called Azure Function) – use a serverless function to run code for automated response like move, delete or quarantine.
Webhook – to connect an application.
Event Hubs & Service Bus Queue – to notify downstream consumers.
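
The following sketch shows the kind of logic a serverless function or webhook might run when it receives a scan result event. The event type string `Microsoft.Security.MalwareScanningResult`, the field names `blobUri` and `scanResultType`, and the sample blob URL are assumptions made for illustration; inspect a real event from your own topic before depending on them. `handle_scan_event` is a hypothetical name.

```python
import json

# Assumed event type and field names; verify them against a real event.
SCAN_EVENT_TYPE = "Microsoft.Security.MalwareScanningResult"


def handle_scan_event(event):
    """Return (action, blob_uri) for one Event Grid event given as a dict."""
    if event.get("eventType") != SCAN_EVENT_TYPE:
        # Not a scan result event, so there is nothing to do.
        return ("ignore", None)
    data = event.get("data", {})
    blob_uri = data.get("blobUri")
    if data.get("scanResultType") == "Malicious":
        return ("quarantine", blob_uri)
    return ("none", blob_uri)


if __name__ == "__main__":
    # Hypothetical payload, shaped after the assumed schema above.
    sample = json.loads("""
    {
      "eventType": "Microsoft.Security.MalwareScanningResult",
      "data": {
        "blobUri": "https://example.blob.core.windows.net/uploads/file.bin",
        "scanResultType": "Malicious"
      }
    }
    """)
    print(handle_scan_event(sample))
```

The function only decides what to do. The actual move, delete, or quarantine step would be carried out by whichever storage client your application already uses.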

Log Analytics

You might want to log your scan results for compliance evidence or investigating scan results. By setting up a Log Analytics Workspace destination, you can store every scan result in a centralized log repository that is easy to query. You can view the results by navigating to the Log Analytics destination workspace and looking for the StorageMalwareScanningResults table.

Cost control

Malware scanning is billed per GB scanned. To provide cost predictability, Malware Scanning supports setting a cap on the amount of GB scanned in a single month per storage account.

Malware scanning in Defender for Storage isn't included in the free 30-day trial. It's charged from the first day, in accordance with the pricing scheme available on the Defender for Cloud pricing page.

The "capping" mechanism sets a monthly scanning limit, measured in gigabytes (GB), for each storage account, and serves as an effective cost control. If a scanning limit is set for a storage account, scanning automatically stops once that threshold is reached within a single calendar month (with up to a 20-GB deviation), and files uploaded after that point aren't scanned for malware. The cap is reset at the end of every month at midnight UTC. Updating the cap typically takes up to an hour to take effect.

By default, a limit of 5 TB (5,000 GB) is established if no specific capping mechanism is defined.

You can set the capping mechanism on either individual storage accounts or across an entire subscription (every storage account on the subscription will be allocated the limit defined on the subscription level).
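
To make the effect of the cap concrete, here is a simplified model of one calendar month for a single storage account. It's an illustration only: the function `simulate_monthly_cap` is hypothetical, and the real service's enforcement is only described above as allowing up to a 20-GB deviation. In this model, scanning stops once the running total reaches the cap, so the upload that crosses the threshold is still scanned.

```python
def simulate_monthly_cap(upload_sizes_gb, cap_gb=5000.0):
    """Return (scanned, skipped) lists of upload sizes for one calendar month.

    Simplified model: once the running total of scanned GB reaches the cap,
    every later upload in the month is skipped.
    """
    scanned, skipped = [], []
    total_gb = 0.0
    for size_gb in upload_sizes_gb:
        if total_gb >= cap_gb:
            skipped.append(size_gb)
        else:
            scanned.append(size_gb)
            total_gb += size_gb
    return scanned, skipped


if __name__ == "__main__":
    scanned, skipped = simulate_monthly_cap([1.0, 1.5, 0.5, 2.0], cap_gb=2.0)
    print("scanned:", scanned)
    print("skipped:", skipped)
```

A model like this can help you estimate how much content would go unscanned at a given cap, which is worth knowing because skipped files remain accessible.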

Additional costs of malware scanning

Malware scanning uses other Azure services as its foundation. This means that when you enable Malware scanning, you will also be charged for the Azure services that it requires. These services include Azure Storage read operations, Azure Storage blob indexing and Azure Event Grid notifications.

Handling possible false positives and false negatives

If you have a file that you suspect might be malware but isn't being detected (false negative) or is being incorrectly detected (false positive), you can submit it to us for analysis through the sample submission portal. Select “Microsoft Defender for Storage” as the source.

Defender for Cloud allows you to suppress false positive alerts. Make sure to limit the suppression rule by using the malware name or file hash.

Malware Scanning doesn't automatically block access or change permissions to the uploaded blob, even if it's malicious.

Limitations

Unsupported features and services

Unsupported storage accounts: Legacy v1 storage accounts aren't supported by malware scanning.
Unsupported service: Azure Files isn't supported by malware scanning.
Unsupported regions: Jio India West, Korea South, South Africa West.
Regions that are supported by Defender for Storage but not by malware scanning. Learn more about availability for Defender for Storage.
Unsupported blob types: Append and Page blobs aren't supported for Malware Scanning.
Unsupported encryption: Client-side encrypted blobs aren't supported as they can't be decrypted before scanning by the service. However, data encrypted at rest by Customer Managed Key (CMK) is supported.
Unsupported index tag results: Index tag scan result isn't supported in storage accounts with Hierarchical namespace enabled (Azure Data Lake Storage Gen2).
Event Grid: Event Grid topics that don't have public network access enabled (that is, topics that use private endpoint connections) aren't supported by malware scanning in Defender for Storage.

Throughput capacity and blob size limit

Scan throughput rate limit: Malware Scanning can process up to 2 GB per minute for each storage account. If the rate of file upload momentarily exceeds this threshold for a storage account, the system attempts to scan the files in excess of the rate limit. If the rate of file upload consistently exceeds this threshold, some blobs won't be scanned.
Blob scan limit: Malware Scanning can process up to 2,000 files per minute for each storage account. If the rate of file upload momentarily exceeds this threshold for a storage account, the system attempts to scan the files in excess of the rate limit. If the rate of file upload consistently exceeds this threshold, some blobs won't be scanned.
Blob size limit: The maximum size limit for a single blob to be scanned is 2 GB. Blobs that are larger than the limit won't be scanned.
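
The limits above can be turned into a simple pre-check, for example to warn users before they upload content that won't be scanned. This is a sketch under stated assumptions: the function names are hypothetical, the blob type string `BlockBlob` is assumed, and 2 GB is interpreted as 2 × 1024³ bytes, which the text doesn't specify.

```python
# Limits taken from the text; the binary interpretation of "GB" is an assumption.
MAX_BLOB_BYTES = 2 * 1024**3
MAX_BYTES_PER_MINUTE = 2 * 1024**3
MAX_FILES_PER_MINUTE = 2000


def is_scannable(blob_type, size_bytes, client_side_encrypted=False):
    """Return True if a blob falls within the documented scanning limits."""
    return (
        blob_type == "BlockBlob"          # append and page blobs aren't scanned
        and not client_side_encrypted     # the service can't decrypt these
        and size_bytes <= MAX_BLOB_BYTES  # larger blobs aren't scanned
    )


def within_rate_limits(bytes_last_minute, files_last_minute):
    """Return True if the upload rate stays within both per-minute limits."""
    return (
        bytes_last_minute <= MAX_BYTES_PER_MINUTE
        and files_last_minute <= MAX_FILES_PER_MINUTE
    )


if __name__ == "__main__":
    print(is_scannable("BlockBlob", 10 * 1024**2))
    print(within_rate_limits(3 * 1024**3, 100))
```

Because a sustained excess over either rate limit means some blobs won't be scanned, a check like `within_rate_limits` is most useful as an alerting signal on your own upload metrics.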

Blob uploads and index tag updates

Upon uploading a blob to the storage account, the malware scanning initiates an extra read operation and updates the index tag. In most cases, these operations don't generate significant load.

Impact on access and storage Input/Output Operations Per Second (IOPS)

Despite the scanning process, access to uploaded data remains unaffected, and the impact on storage IOPS is minimal.