graph_exposure_perimeter_fl()
Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer ✅ Azure Monitor ✅ Microsoft Sentinel
Calculate the Exposure Perimeter (list and score) of target nodes over path or edge data.
The function graph_exposure_perimeter_fl() is a UDF (user-defined function) that calculates the Exposure Perimeter of each target node based on path or edge data. Each row of input data contains a source node and a target node, which can represent either a direct connection (edge) between them or a longer multi-hop path. If the paths aren't available, you can first discover them using the graph-match operator or the graph_path_discovery_fl() function, and then run graph_exposure_perimeter_fl() on top of the path discovery output.
Exposure Perimeter represents the accessibility of a specific target from relevant source nodes. The more sources that can access the target, the more exposed it is to potential compromise by an attacker, hence the name. Nodes with a high Exposure Perimeter are important in the cybersecurity domain because they are more likely to be reached illegitimately and are highly valued by attackers. Such nodes should be protected accordingly, by hardening and monitoring their perimeter.
The function outputs a list of connected sources that can reach each target, along with a score representing the number of those sources. Optionally, if there's a meaningful 'weight' for each source (such as vulnerability or exposedness), a weighted score is calculated as the sum of the sources' weights. In addition, the maximum total number of returned targets and the maximum number of sources in each list are exposed as optional parameters for better control.
Syntax
graph_exposure_perimeter_fl(sourceIdColumnName, targetIdColumnName, [sourceWeightColumnName], [resultCountLimit], [listedIdsLimit])
Learn more about syntax conventions.
Parameters
Name | Type | Required | Description |
---|---|---|---|
sourceIdColumnName | string | ✔️ | The name of the column containing the source node IDs (either for edges or paths). |
targetIdColumnName | string | ✔️ | The name of the column containing the target node IDs (either for edges or paths). |
sourceWeightColumnName | string | | The name of the column containing the source nodes' weights (such as vulnerability). If no relevant weights are present, the weighted score is equal to 0. The default column name is 'noWeightsColumn'. |
resultCountLimit | long | | The maximum number of returned rows (sorted by descending score). The default value is 100000. |
listedIdsLimit | long | | The maximum number of sources listed for each target. The default value is 50. |
Function definition
You can define the function by either embedding its code as a query-defined function, or creating it as a stored function in your database, as follows:
Define the function using the following let statement. No permissions are required.
Important
A let statement can't run on its own. It must be followed by a tabular expression statement. To run a working example of graph_exposure_perimeter_fl(), see Example.
let exposure_perimeter_fl = (T:(*), sourceIdColumnName:string, targetIdColumnName:string, sourceWeightColumnName:string = 'noWeightsColumn'
    , resultCountLimit:long = 100000, listedIdsLimit:long = 50)
{
    // Normalize the input columns; a missing weight column falls back to 0.
    let paths = (
        T
        | extend sourceId = column_ifexists(sourceIdColumnName, '')
        | extend targetId = column_ifexists(targetIdColumnName, '')
        | extend sourceWeight = tolong(column_ifexists(sourceWeightColumnName, 0))
    );
    // Aggregate the sources per target: capped list, distinct count, and sum of weights.
    let aggregatedPaths = (
        paths
        | sort by targetId, sourceWeight desc
        | summarize exposurePerimeterList = array_slice(make_set_if(sourceId, isnotempty(sourceId)), 0, (listedIdsLimit - 1))
            , exposurePerimeterScore = dcountif(sourceId, isnotempty(sourceId))
            , exposurePerimeterScoreWeighted = sum(sourceWeight)
            by targetId
        | extend isExposurePerimeterCapped = (exposurePerimeterScore > listedIdsLimit)
    );
    // Return the targets with the highest scores.
    aggregatedPaths
    | top resultCountLimit by exposurePerimeterScore desc
};
// Write your query to use the function here.
Example
The following example uses the invoke operator to run the function.
To use a query-defined function, invoke it after the embedded function definition.
let connections = datatable (SourceNodeName:string, TargetNodeName:string, SourceNodeVulnerability:int)[
'vm-work-1', 'webapp-prd', 0,
'vm-custom', 'webapp-prd', 4,
'webapp-prd', 'vm-custom', 1,
'webapp-prd', 'test-machine', 1,
'vm-custom', 'server-0126', 4,
'vm-custom', 'hub_router', 4,
'webapp-prd', 'hub_router', 2,
'test-machine', 'vm-custom', 5,
'test-machine', 'hub_router', 5,
'hub_router', 'remote_DT', 0,
'vm-work-1', 'storage_main_backup', 0,
'hub_router', 'vm-work-2', 0,
'vm-work-2', 'backup_prc', 1,
'remote_DT', 'backup_prc', 2,
'backup_prc', 'storage_main_backup', 0,
'backup_prc', 'storage_DevBox', 0,
'device_A1', 'sevice_B2', 1,
'sevice_B2', 'device_A1', 2
];
let exposure_perimeter_fl = (T:(*), sourceIdColumnName:string, targetIdColumnName:string, sourceWeightColumnName:string = 'noWeightsColumn'
    , resultCountLimit:long = 100000, listedIdsLimit:long = 50)
{
    // Normalize the input columns; a missing weight column falls back to 0.
    let paths = (
        T
        | extend sourceId = column_ifexists(sourceIdColumnName, '')
        | extend targetId = column_ifexists(targetIdColumnName, '')
        | extend sourceWeight = tolong(column_ifexists(sourceWeightColumnName, 0))
    );
    // Aggregate the sources per target: capped list, distinct count, and sum of weights.
    let aggregatedPaths = (
        paths
        | sort by targetId, sourceWeight desc
        | summarize exposurePerimeterList = array_slice(make_set_if(sourceId, isnotempty(sourceId)), 0, (listedIdsLimit - 1))
            , exposurePerimeterScore = dcountif(sourceId, isnotempty(sourceId))
            , exposurePerimeterScoreWeighted = sum(sourceWeight)
            by targetId
        | extend isExposurePerimeterCapped = (exposurePerimeterScore > listedIdsLimit)
    );
    // Return the targets with the highest scores.
    aggregatedPaths
    | top resultCountLimit by exposurePerimeterScore desc
};
connections
| invoke exposure_perimeter_fl(sourceIdColumnName = 'SourceNodeName'
, targetIdColumnName = 'TargetNodeName'
, sourceWeightColumnName = 'SourceNodeVulnerability'
)
Output
targetId | exposurePerimeterList | exposurePerimeterScore | exposurePerimeterScoreWeighted | isExposurePerimeterCapped |
---|---|---|---|---|
hub_router | ["vm-custom","webapp-prd","test-machine"] | 3 | 11 | FALSE |
storage_main_backup | ["vm-work-1","backup_prc"] | 2 | 0 | FALSE |
vm-custom | ["webapp-prd","test-machine"] | 2 | 6 | FALSE |
backup_prc | ["vm-work-2","remote_DT"] | 2 | 3 | FALSE |
webapp-prd | ["vm-work-1","vm-custom"] | 2 | 4 | FALSE |
test-machine | ["webapp-prd"] | 1 | 1 | FALSE |
server-0126 | ["vm-custom"] | 1 | 4 | FALSE |
remote_DT | ["hub_router"] | 1 | 0 | FALSE |
vm-work-2 | ["hub_router"] | 1 | 0 | FALSE |
storage_DevBox | ["backup_prc"] | 1 | 0 | FALSE |
device_A1 | ["sevice_B2"] | 1 | 2 | FALSE |
sevice_B2 | ["device_A1"] | 1 | 1 | FALSE |
Running the function takes the connections between sources and targets and aggregates the sources by target. For each target, the Exposure Perimeter represents the sources that can connect to it, expressed as a score (regular and weighted) and as a list.
Each row in the output contains the following fields:
targetId: the ID of the target node, taken from the relevant column.
exposurePerimeterList: a list of source node IDs (taken from the relevant column) that can connect to the target node. The list is capped at the maximum length set by the listedIdsLimit parameter.
exposurePerimeterScore: the count of source nodes that can connect to the target. A high Exposure Perimeter score indicates that the target node can potentially be accessed from many sources and should be treated accordingly.
exposurePerimeterScoreWeighted: the sum of the optional source nodes' weight column, representing their value, such as vulnerability or exposedness. If such a weight exists, the weighted Exposure Perimeter score might be a more accurate metric of the target node's value, because it accounts for potential access from highly vulnerable or exposed sources.
isExposurePerimeterCapped: a boolean flag indicating whether the list of sources was capped by the listedIdsLimit parameter. If it's true, then other sources can access the target in addition to the listed ones (up to the number given by exposurePerimeterScore).
In the example above, we run the graph_exposure_perimeter_fl() function on top of connections between sources and targets. In the first row of the output, we can see that the target node 'hub_router' can be reached from three sources ('vm-custom', 'webapp-prd', 'test-machine'). We use the SourceNodeVulnerability column of the input data as source weights and get a cumulative weight of 11. Also, since the number of sources is 3 and the default list limit is 50, all of the sources are shown, so the value of the isExposurePerimeterCapped column is FALSE.
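The capping behavior can be illustrated with a minimal sketch that inlines the function's aggregation step for the three 'hub_router' edges from the example. The limit of 2 is an assumed value chosen only so that the cap takes effect; it isn't a recommended setting.

```kusto
// Assumed limit, deliberately lower than the number of sources (3).
let listedIdsLimit = 2;
datatable (SourceNodeName:string, TargetNodeName:string, SourceNodeVulnerability:long)[
    'vm-custom', 'hub_router', 4,
    'webapp-prd', 'hub_router', 2,
    'test-machine', 'hub_router', 5
]
// Same aggregation as in the function body, applied directly to the columns.
| summarize exposurePerimeterList = array_slice(make_set(SourceNodeName), 0, listedIdsLimit - 1)
    , exposurePerimeterScore = dcount(SourceNodeName)
    , exposurePerimeterScoreWeighted = sum(SourceNodeVulnerability)
    by TargetNodeName
// The flag is true when more sources exist than the list can show.
| extend isExposurePerimeterCapped = (exposurePerimeterScore > listedIdsLimit)
```

With three distinct sources and a limit of two, exposurePerimeterScore stays at 3 and exposurePerimeterScoreWeighted at 11, while the list holds only two of the sources and isExposurePerimeterCapped becomes true.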
If the multi-hop paths aren't available, we can build them between sources and targets (for example, by running graph_path_discovery_fl()) and then run graph_exposure_perimeter_fl() on top of the results.
The output looks similar, but reflects the Exposure Perimeter calculated over multi-hop paths, which makes it a better indicator of the target nodes' true accessibility from relevant sources. To find the full paths between source and target nodes (for example, for disruption scenarios), the graph_path_discovery_fl() function can be used with filters on the relevant source and target nodes.
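As a rough sketch of the multi-hop idea, the query below uses the built-in make-graph and graph-match operators instead of graph_path_discovery_fl(), and then applies a simplified version of the same per-target aggregation. The three-edge chain, the node property name nodeId, and the hop limit of 3 are all assumptions made for illustration.

```kusto
// Assumed sample edges forming a simple chain.
let connections = datatable (SourceNodeName:string, TargetNodeName:string)[
    'vm-work-1', 'webapp-prd',
    'webapp-prd', 'hub_router',
    'hub_router', 'remote_DT'
];
connections
// Build a graph from the edges; nodeId is an assumed name for the node ID property.
| make-graph SourceNodeName --> TargetNodeName with_node_id=nodeId
// Match every path of one to three hops (the upper bound is an arbitrary choice).
| graph-match (src)-[hops*1..3]->(dst)
    project sourceId = src.nodeId, targetId = dst.nodeId
// Simplified aggregation: distinct sources per target, without weights or caps.
| summarize exposurePerimeterList = make_set(sourceId)
    , exposurePerimeterScore = dcount(sourceId)
    by targetId
| sort by exposurePerimeterScore desc
```

Compared with running the aggregation on direct edges only, targets further down the chain also pick up their upstream sources, which is the effect described above.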
The function graph_exposure_perimeter_fl() can be used to calculate the Exposure Perimeter of target nodes, either over direct edges or over longer paths. In the cybersecurity domain, it can provide several insights. Exposure Perimeter scores (regular and weighted) represent a target node's importance from both the defenders' and the attackers' perspectives. Nodes with a high Exposure Perimeter (especially critical ones) should be protected accordingly (for example, in terms of access monitoring and hardening), and security signals (such as alerts) on sources that can access these nodes should be prioritized. The Exposure Perimeter list should be monitored for undesired connections between sources and targets and used in disruption scenarios (for example, if one of the sources is actively compromised, the connections between it and important targets should be broken).
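One way to prioritize security signals on sources is to expand the Exposure Perimeter list and join it with alert data. The sketch below is illustrative only: the perimeter rows mimic the function output, while the alerts table, its DeviceName and AlertTitle columns, and the score threshold of 2 are hypothetical.

```kusto
// Rows shaped like the function output (values taken from the example above).
let perimeter = datatable (targetId:string, exposurePerimeterList:dynamic, exposurePerimeterScore:long)[
    'hub_router', dynamic(["vm-custom","webapp-prd","test-machine"]), 3,
    'test-machine', dynamic(["webapp-prd"]), 1
];
// Hypothetical alerts table; real alert data will have a different schema.
let alerts = datatable (DeviceName:string, AlertTitle:string)[
    'vm-custom', 'Suspicious sign-in',
    'vm-work-2', 'Malware detected'
];
perimeter
// Keep only highly exposed targets (the threshold is an assumption).
| where exposurePerimeterScore >= 2
// One row per (target, source) pair.
| mv-expand sourceId = exposurePerimeterList to typeof(string)
// Keep alerts raised on sources that can reach those targets.
| join kind=inner alerts on $left.sourceId == $right.DeviceName
| project targetId, sourceId, AlertTitle, exposurePerimeterScore
```

Here only the alert on 'vm-custom' is kept, because it is the only alerted device that appears in the perimeter of a target at or above the threshold.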