Compare commits
18 Commits
| SHA1 | Author | Date | |
|---|---|---|---|
| 84057d883f | |||
| be1ac7b9bb | |||
| 5c09e22358 | |||
| eafc75c2f7 | |||
| f7b72d4516 | |||
| a865bd64db | |||
| 6958cc423b | |||
| b264e63627 | |||
| 71e48c6005 | |||
| 002dd3de50 | |||
| 8191c3c743 | |||
| ce2aea85bd | |||
| f289891374 | |||
| fe80f95a19 | |||
| 5295f8165d | |||
| 42565839f9 | |||
| f3bce1b753 | |||
| a676205cfb |
@@ -33,7 +33,7 @@ jobs:
|
||||
run: |
|
||||
python main.py generate_schema ${HOME}/${{ env.FUNCTION_SCHEMA_FILE_NAME }}
|
||||
- name: Speckle Automate Function - Build and Publish
|
||||
uses: specklesystems/speckle-automate-github-composite-action@0.9.0
|
||||
uses: specklesystems/speckle-automate-github-composite-action@0.9.2
|
||||
with:
|
||||
speckle_automate_url: ${{ env.SPECKLE_AUTOMATE_URL || vars.SPECKLE_AUTOMATE_URL || 'https://automate.speckle.dev' }}
|
||||
speckle_token: ${{ secrets.SPECKLE_FUNCTION_TOKEN }}
|
||||
|
||||
@@ -313,3 +313,5 @@ pyrightconfig.json
|
||||
.ionide
|
||||
|
||||
# End of https://www.toptal.com/developers/gitignore/api/visualstudiocode,python,pycharm
|
||||
|
||||
.env-v3
|
||||
|
||||
Generated
+1
@@ -2,6 +2,7 @@
|
||||
<project version="4">
|
||||
<component name="RuffConfigService">
|
||||
<option name="runRuffOnSave" value="true" />
|
||||
<option name="useRuffFormat" value="true" />
|
||||
<option name="useRuffServer" value="true" />
|
||||
</component>
|
||||
</project>
|
||||
@@ -0,0 +1,273 @@
|
||||
# Data Shield - Developer Guide
|
||||
|
||||
This document provides technical information for developers working on the Data Shield Speckle Automate function. It covers deployment workflows, core components, and guidance for extending the function.
|
||||
|
||||
## Deployment to Speckle Automate
|
||||
|
||||
### Creating a Release
|
||||
|
||||
The function is automatically deployed to Speckle Automate when a new release is created on GitHub:
|
||||
|
||||
1. Ensure your changes are committed and pushed to the main branch
|
||||
2. Create a `requirements.txt` file (see the next section) and commit it to the main branch
|
||||
3. Create a new GitHub release:
|
||||
- Go to your repository on GitHub
|
||||
- Navigate to "Releases" under the repository name
|
||||
- Click "Draft a new release"
|
||||
- Create a **new tag** (e.g., `v1.0.1`)
|
||||
- Write a descriptive title and release notes
|
||||
- Click "Publish release"
|
||||
|
||||
Creating a new release triggers the GitHub Actions workflow defined in `main.yml`, which builds and publishes the function to Speckle Automate.
|
||||
|
||||
### Managing Dependencies
|
||||
|
||||
You can use any dependency management tool of your choice for local development (Poetry, pip, uv, etc.), but Speckle Automate requires a `requirements.txt` file for deployment.
|
||||
|
||||
**Important**: You must create and commit the `requirements.txt` file to the repository **before** creating a release. The deployment workflow relies on this file being present in the repository.
|
||||
|
||||
To generate and commit the requirements file based on your local environment:
|
||||
|
||||
- With standard pip: `pip freeze > requirements.txt`
|
||||
- With uv: `uv pip freeze > requirements.txt`
|
||||
- With Poetry: `poetry export -f requirements.txt --output requirements.txt --without-hashes`
|
||||
- Or manually create/edit the file to include necessary dependencies
|
||||
|
||||
Then commit the updated file:
|
||||
```bash
git add requirements.txt
git commit -m "Update requirements.txt"
git push
```
|
||||
|
||||
Create a new release, as described above, only after `requirements.txt` has been committed.
|
||||
|
||||
Note that during deployment, the GitHub Actions workflow uses `uv` to install the dependencies, but your local development environment can use any tool you prefer.
|
||||
|
||||
### Deployment Workflow Details
|
||||
|
||||
The deployment workflow:
|
||||
|
||||
1. Checks out the repository
|
||||
2. Sets up Python 3.13
|
||||
3. Installs dependencies from `requirements.txt`
|
||||
4. Extracts the function schema
|
||||
5. Uses the Speckle Automate GitHub composite action to:
|
||||
- Build a Docker image with the function
|
||||
- Push the image to the Speckle Automate registry
|
||||
- Update the function in Speckle Automate
|
||||
|
||||
## Core Components
|
||||
|
||||
### Parameter Matching System
|
||||
|
||||
The function uses a strategy pattern for parameter matching, allowing flexible and extensible matching rules:
|
||||
|
||||
#### ParameterMatcher Classes
|
||||
|
||||
* `ParameterMatcher` (ABC): Abstract base class for all matchers
|
||||
* `PrefixMatcher`: Matches parameters by prefix (with optional case sensitivity)
|
||||
* `PatternMatcher`: Uses regex/glob patterns for more complex matching
|
||||
|
||||
```python
# Example: Creating a custom matcher
class SuffixMatcher(ParameterMatcher):
    """Matches parameters by suffix."""

    def matches(self, param_name: str) -> bool:
        """Check if the parameter name ends with the match value."""
        if self.strict_mode:
            return param_name.endswith(self.match_value)
        return param_name.lower().endswith(self.match_value.lower())
```
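To see how strict mode changes a matcher's result, here is a minimal, self-contained sketch. The `ParameterMatcher` base class and its constructor signature are stand-ins assumed for illustration; only the `match_value` and `strict_mode` attribute names and the `matches()` method come from the example above.

```python
from abc import ABC, abstractmethod


class ParameterMatcher(ABC):
    """Hypothetical stand-in for the project's abstract matcher."""

    def __init__(self, match_value: str, strict_mode: bool = False) -> None:
        # Assumed constructor: the real signature may differ.
        self.match_value = match_value
        self.strict_mode = strict_mode

    @abstractmethod
    def matches(self, param_name: str) -> bool:
        """Return True when the parameter name satisfies the rule."""


class PrefixMatcher(ParameterMatcher):
    def matches(self, param_name: str) -> bool:
        # Strict mode compares case-sensitively; otherwise both sides are lowercased.
        if self.strict_mode:
            return param_name.startswith(self.match_value)
        return param_name.lower().startswith(self.match_value.lower())


class SuffixMatcher(ParameterMatcher):
    def matches(self, param_name: str) -> bool:
        if self.strict_mode:
            return param_name.endswith(self.match_value)
        return param_name.lower().endswith(self.match_value.lower())


names = ["Secret_Cost", "secret_note", "Area", "area_internal"]

strict_hits = [n for n in names if PrefixMatcher("secret_", strict_mode=True).matches(n)]
loose_hits = [n for n in names if PrefixMatcher("secret_").matches(n)]
suffix_hits = [n for n in names if SuffixMatcher("_INTERNAL").matches(n)]
```

With strict mode on, only `secret_note` matches the `secret_` prefix. With it off, `Secret_Cost` matches as well.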
|
||||
|
||||
#### Pattern Checking
|
||||
|
||||
The `PatternChecker` class handles both glob-style patterns (e.g., `speckle_*`) and regular expressions (e.g., `/^speckle_\d+$/i`):
|
||||
|
||||
* Glob patterns use `fnmatch` for simple wildcard matching
|
||||
* Regex patterns must be wrapped in slashes (`/pattern/`)
|
||||
* Case sensitivity is controlled by:
|
||||
- The global `strict_mode` parameter
|
||||
- The `/i` flag for regex patterns (overrides `strict_mode`)
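The rules above can be sketched with the standard library alone. This is not the project's `PatternChecker`; `compile_checker` is a hypothetical helper, and treating non-strict regex patterns as case-insensitive is an assumption drawn from the description of `strict_mode`.

```python
import fnmatch
import re


def compile_checker(pattern: str, strict_mode: bool = False):
    """Return a predicate for a glob or /regex/ pattern (hypothetical helper)."""
    regex_form = re.fullmatch(r"/(.*)/([a-z]*)", pattern, flags=re.DOTALL)
    if regex_form:
        body, flags_text = regex_form.groups()
        # The /i flag forces case-insensitive matching even in strict mode.
        # Assumption: without strict mode, regexes also ignore case.
        ignore_case = "i" in flags_text or not strict_mode
        compiled = re.compile(body, re.IGNORECASE if ignore_case else 0)
        return lambda name: compiled.search(name) is not None

    if strict_mode:
        # fnmatchcase never normalises case, regardless of platform.
        return lambda name: fnmatch.fnmatchcase(name, pattern)
    return lambda name: fnmatch.fnmatchcase(name.lower(), pattern.lower())


glob_loose = compile_checker("speckle_*")
glob_strict = compile_checker("speckle_*", strict_mode=True)
regex_with_flag = compile_checker(r"/^speckle_\d+$/i", strict_mode=True)
regex_strict = compile_checker(r"/^speckle_\d+$/", strict_mode=True)
single_char = compile_checker("?_internal")
```

Using `fnmatch.fnmatchcase` instead of `fnmatch.fnmatch` keeps the behaviour the same on every operating system, because `fnmatch.fnmatch` normalises case on some platforms.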
|
||||
|
||||
### Traversal System
|
||||
|
||||
The function uses Speckle's graph traversal system to navigate the complex object hierarchy:
|
||||
|
||||
1. `GraphTraversal` from `specklepy.objects.graph_traversal.traversal` defines rules for how to navigate objects
|
||||
2. `TraversalRule` objects define:
|
||||
- Conditions for when a rule applies to an object
|
||||
- Methods to extract the next objects to traverse
|
||||
3. Our custom rules in `traversal.py` focus on:
|
||||
- `display_value_rule`: For objects with displayValue/elements properties
|
||||
- `default_rule`: General fallback for traversing all object members
|
||||
|
||||
The traversal system provides contexts that contain:
|
||||
- The current object being traversed
|
||||
- The path taken to reach that object
|
||||
- Other metadata used during traversal
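The idea of rules and contexts can be illustrated without the SDK. The sketch below uses plain dictionaries and hypothetical `Rule`, `Context`, and `traverse` names; it is not the `specklepy` API, and the choice to follow only `elements` for display-value objects is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class Context:
    """Hypothetical traversal context: the current object and how it was reached."""

    current: dict
    member_name: Optional[str] = None
    parent: Optional["Context"] = None

    def path(self) -> list[str]:
        # Walk back up the parent chain to rebuild the member path.
        names = []
        node = self
        while node is not None and node.member_name is not None:
            names.append(node.member_name)
            node = node.parent
        return list(reversed(names))


@dataclass
class Rule:
    condition: Callable[[dict], bool]      # when does the rule apply?
    members: Callable[[dict], list[str]]   # which members are traversed next?


def traverse(root: dict, rules: list[Rule]) -> Iterator[Context]:
    stack = [Context(root)]
    while stack:
        ctx = stack.pop()
        yield ctx
        # The first rule whose condition holds decides where to go next.
        rule = next((r for r in rules if r.condition(ctx.current)), None)
        if rule is None:
            continue
        for name in rule.members(ctx.current):
            value = ctx.current.get(name)
            children = value if isinstance(value, list) else [value]
            for child in children:
                if isinstance(child, dict):
                    stack.append(Context(child, name, ctx))


# Assumed behaviour: objects with displayValue/elements only descend into "elements".
display_value_rule = Rule(
    condition=lambda o: "displayValue" in o or "elements" in o,
    members=lambda o: ["elements"],
)
default_rule = Rule(condition=lambda o: True, members=lambda o: list(o.keys()))

wall = {"name": "wall", "displayValue": [{"name": "mesh"}], "elements": [{"name": "window"}]}
root = {"name": "model", "elements": [wall]}

contexts = list(traverse(root, [display_value_rule, default_rule]))
visited = [c.current["name"] for c in contexts]
window_path = contexts[-1].path()
```

In this sketch the display mesh is never visited, because the first matching rule limits traversal to `elements`. That keeps parameter processing focused on the objects that carry data.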
|
||||
|
||||
### Parameter Actions
|
||||
|
||||
Actions implement the logic for what to do when a parameter match is found:
|
||||
|
||||
#### ParameterAction Classes
|
||||
|
||||
* `ParameterAction` (ABC): Abstract base class for all actions
|
||||
* `RemovalAction`: Removes matching parameters from objects
|
||||
* `AnonymizationAction`: Masks email addresses in parameter values
|
||||
|
||||
Each action implements:
|
||||
- `check()`: Determines if the action should be applied
|
||||
- `apply()`: Performs the action on a matching parameter
|
||||
- `report()`: Generates feedback for the Automate context
|
||||
|
||||
```python
# Example: Creating a custom action
class TransformAction(ParameterAction):
    """Action to transform parameter values based on a rule."""

    def __init__(self, matcher: ParameterMatcher, transform_func) -> None:
        """Initialize with a matcher strategy and transform function."""
        super().__init__()
        self.matcher = matcher
        self.transform_func = transform_func

    def check(self, param_name: str) -> bool:
        """Check if parameter matches using the provided matcher."""
        return self.matcher.matches(param_name)

    def apply(self, parameter, parent_object, containing_dict, parameter_key) -> None:
        """Transform the parameter value."""
        param_name = parameter.get("name", parameter_key)
        object_id = getattr(parent_object, "id", None)

        if "value" in parameter and isinstance(parameter["value"], str):
            parameter["value"] = self.transform_func(parameter["value"])

            # Track affected object and parameter
            self.affected_parameters[object_id].append(param_name)

    def report(self, automate_context: AutomationContext) -> None:
        """Report the transformed parameters."""
        if not self.affected_parameters:
            return

        transformed_params = set(param for params in self.affected_parameters.values() for param in params)

        message = f"Transformed {len(transformed_params)} parameters"

        automate_context.attach_info_to_objects(
            category="Transformed_Parameters",
            object_ids=list(self.affected_parameters.keys()),
            message=message,
        )
```
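`AnonymizationAction` is described above as masking email addresses, and `TransformAction` accepts any string-to-string `transform_func`. A minimal sketch of such a masking function follows. The regular expression and the `j***@domain` mask format are assumptions for illustration, not the project's `EmailMatcher` implementation.

```python
import re

# Deliberately simple pattern; real-world email syntax is broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_email(match: re.Match) -> str:
    """Keep the first character of the local part and the full domain."""
    local, _, domain = match.group(0).partition("@")
    return f"{local[0]}***@{domain}"


def anonymize_emails(text: str) -> str:
    """Replace every email address found in the text with its masked form."""
    return EMAIL_RE.sub(mask_email, text)


masked = anonymize_emails("Contact jane.doe@example.com or bob@site.org")
untouched = anonymize_emails("No contact details here")
```

A function like `anonymize_emails` could be passed as `transform_func` to the custom action shown above. The surrounding text stays intact and only the addresses change.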
|
||||
|
||||
#### Parameter Processing
|
||||
|
||||
The `ParameterProcessor` class orchestrates the application of actions:
|
||||
|
||||
1. Takes an action and a flag indicating whether to check parameter names or values
|
||||
2. Processes traversal contexts by examining properties and parameters
|
||||
3. Handles both modern (v3) and legacy (v2) Speckle objects
|
||||
4. Applies the action to matching parameters
|
||||
5. Tracks processed objects for reporting
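The steps above can be sketched with plain dictionaries standing in for Speckle objects. `SketchProcessor` and `RemoveByPrefix` are hypothetical names, and the flat `properties`/`parameters` layout is a simplification; real objects can nest properties more deeply.

```python
from collections import defaultdict
from typing import Any


class RemoveByPrefix:
    """Hypothetical action: check() decides, apply() mutates and records."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.affected_parameters: dict[Any, list[str]] = defaultdict(list)

    def check(self, subject: str) -> bool:
        return subject.startswith(self.prefix)

    def apply(self, parameter: dict, parent_object: dict, containing_dict: dict, parameter_key: str) -> None:
        param_name = parameter.get("name", parameter_key)
        containing_dict.pop(parameter_key, None)
        self.affected_parameters[parent_object.get("id")].append(param_name)


class SketchProcessor:
    def __init__(self, action, check_values: bool = False) -> None:
        self.action = action
        self.check_values = check_values  # False: test names, True: test values
        self.processed_objects: set = set()

    def process(self, obj: dict) -> None:
        # Prefer the v3 "properties" container, fall back to v2 "parameters".
        container = obj.get("properties")
        if container is None:
            container = obj.get("parameters")
        if not isinstance(container, dict):
            return
        # Copy the items so the action may delete keys while we iterate.
        for key, parameter in list(container.items()):
            if not isinstance(parameter, dict):
                continue
            subject = parameter.get("value") if self.check_values else parameter.get("name", key)
            if isinstance(subject, str) and self.action.check(subject):
                self.action.apply(parameter, obj, container, key)
                self.processed_objects.add(obj.get("id"))


v3_object = {"id": "a1", "properties": {
    "secret_cost": {"name": "secret_cost", "value": 10},
    "area": {"name": "area", "value": 5},
}}
v2_object = {"id": "b2", "parameters": {"p1": {"name": "secret_note", "value": "x"}}}

action = RemoveByPrefix("secret_")
processor = SketchProcessor(action)
for speckle_object in (v3_object, v2_object):
    processor.process(speckle_object)
```

Iterating over a copy of the container's items is the key detail here: it lets a removal action delete entries without disturbing the loop.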
|
||||
|
||||
### Adding New Sanitization Modes
|
||||
|
||||
To add a new sanitization mode:
|
||||
|
||||
1. Update the `SanitizationMode` enum in `inputs.py`:
|
||||
   ```python
   class SanitizationMode(Enum):
       PREFIX_MATCHING = "Prefix Matching"
       PATTERN_MATCHING = "Pattern Matching"
       ANONYMIZATION = "Anonymization"
       NEW_MODE = "Your New Mode"  # Add your new mode here
   ```
|
||||
|
||||
2. Create any necessary new matchers or actions in `actions.py`
|
||||
|
||||
3. Update the `automate_function` in `function.py` to handle the new mode:
|
||||
   ```python
   if function_inputs.sanitization_mode == SanitizationMode.NEW_MODE:
       # Add specific validation for your new mode
       action = create_your_new_action()  # Create a factory function for your action
   ```
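The "specific validation" mentioned in step 3 might look like the sketch below. `validate_inputs` and `MODES_NEEDING_INPUT` are hypothetical names, and the rule that prefix and pattern modes need a non-empty `parameter_input` is an assumption based on how the modes are described.

```python
from enum import Enum


class SanitizationMode(Enum):
    PREFIX_MATCHING = "Prefix Matching"
    PATTERN_MATCHING = "Pattern Matching"
    ANONYMIZATION = "Anonymization"


# Assumption: these modes cannot run without a user-supplied prefix or pattern.
MODES_NEEDING_INPUT = {SanitizationMode.PREFIX_MATCHING, SanitizationMode.PATTERN_MATCHING}


def validate_inputs(mode: SanitizationMode, parameter_input: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the inputs are usable."""
    errors = []
    if mode in MODES_NEEDING_INPUT and not parameter_input.strip():
        errors.append(f"{mode.value} requires a non-empty parameter input")
    return errors


prefix_errors = validate_inputs(SanitizationMode.PREFIX_MATCHING, "   ")
anonymization_errors = validate_inputs(SanitizationMode.ANONYMIZATION, "")
```

Returning a list of messages rather than raising on the first problem lets the function report every input issue in a single failed run.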
|
||||
|
||||
## Function Flow
|
||||
|
||||
The main function flow is:
|
||||
|
||||
1. User selects a sanitization mode and provides parameters via the UI
|
||||
2. Function creates the appropriate action based on the mode
|
||||
3. Version data is received from Speckle
|
||||
4. Traversal rules navigate through the object tree
|
||||
5. Parameters are processed with the selected action
|
||||
6. Results are reported back to the Automate context
|
||||
7. A new sanitized version is created
|
||||
|
||||
## Additional Resources
|
||||
|
||||
- [Speckle Automate Documentation](https://automate.speckle.dev/)
|
||||
- [Speckle Python SDK Documentation](https://speckle.guide/dev/python.html)
|
||||
- [Pydantic Documentation](https://docs.pydantic.dev/) (for function inputs)
|
||||
|
||||
## Testing
|
||||
|
||||
### Local Testing with pytest
|
||||
|
||||
pytest is the recommended way to test Speckle Automate functions locally. This allows you to verify your function works correctly before deploying it.
|
||||
|
||||
1. Set up your test environment by creating a `.env` file with your Speckle credentials:
|
||||
|
||||
```
SPECKLE_TOKEN="9a110400812dc32b57e524c9c6f1a2000ebabec1c9"
SPECKLE_SERVER_URL="https://app.speckle.systems/"
SPECKLE_PROJECT_ID="d94c63b75d"
SPECKLE_AUTOMATION_ID="99896f98b6"
```
|
||||
|
||||
2. Run the tests with your preferred method:
|
||||
|
||||
```bash
# Using pytest directly
python -m pytest

# Or if using a virtual environment tool
# poetry run pytest
```
|
||||
|
||||
The tests in `test_function.py` provide examples of how to set up the automation context and run the function with different inputs.
|
||||
|
||||
### Setting Up a Test Automation
|
||||
|
||||
To properly test your function, you should:
|
||||
|
||||
1. Create a test automation in Speckle Automate
2. Use the IDs and token it provides in your `.env` file

This lets your tests interact with actual Speckle objects and verify the function's behavior.
|
||||
|
||||
The `speckle-automate` package provides fixtures that help with loading these environment variables and setting up the test context automatically.
|
||||
|
||||
Example test setup:
|
||||
|
||||
```python
from speckle_automate import AutomationContext, AutomationRunData, AutomationStatus, run_function

from data_shield.inputs import FunctionInputs, SanitizationMode


def test_function_run(test_automation_run_data: AutomationRunData, test_automation_token: str) -> None:
    """Run an integration test for the automate function."""
    automation_context = AutomationContext.initialize(test_automation_run_data, test_automation_token)

    # Run your function with test inputs
    automate_sdk = run_function(
        automation_context,
        automate_function,
        FunctionInputs(sanitization_mode=SanitizationMode.PATTERN_MATCHING, parameter_input="test_*", strict_mode=True),
    )

    # Verify the results
    assert automate_sdk.run_status == AutomationStatus.SUCCEEDED
```
|
||||
|
||||
The fixtures `test_automation_run_data` and `test_automation_token` are provided by the `speckle-automate` package and automatically use the values from your `.env` file.
|
||||
@@ -1,100 +1,117 @@
|
||||
# Speckle Automate function template - Python
|
||||
# 🛡️ Data Shield — User Guide
|
||||
|
||||
This template repository is for a Speckle Automate function written in Python
|
||||
using the [specklepy](https://pypi.org/project/specklepy/) SDK to interact with Speckle data.
|
||||
**Data Shield** is a Speckle Automate function that helps you keep your model data clean, safe, and share-ready. Whether you're sending models to clients or collaborators or just tidying up before archiving, Data Shield has your back.
|
||||
|
||||
This template contains the full scaffolding required to publish a function to the Automate environment.
|
||||
It also has some sane defaults for development environment setups.
|
||||
## ✨ What Data Shield Does
|
||||
|
||||
## Getting started
|
||||
Data Shield scans your Speckle model for parameters you’d rather not share and takes care of them for you. It creates a fresh, sanitized version of your model while keeping the original intact.
|
||||
|
||||
1. Use this template repository to create a new repository in your own / organization's profile.
|
||||
### Why you’ll love it:
|
||||
- **Privacy Protection** — Say goodbye to accidentally sharing sensitive data.
|
||||
- **Data Compliance** — Stay on the right side of data protection policies.
|
||||
- **Confident Collaboration** — Share models without oversharing.
|
||||
|
||||
Register the function
|
||||
---
|
||||
|
||||
### Add new dependencies
|
||||
## Shield Modes
|
||||
|
||||
To add new Python package dependencies to the project, use the following:
|
||||
`$ poetry add pandas`
|
||||
We know one size doesn’t fit all, so Data Shield offers three modes to suit your style:
|
||||
|
||||
### Change launch variables
|
||||
### Prefix Matching
|
||||
> **Best for:** Simple, predictable naming conventions.
|
||||
|
||||
Describe how the launch.json should be edited.
|
||||
Remove parameters that start with a specific prefix.
|
||||
> Example: Want to remove everything starting with `secret_`? Just set that prefix, and Data Shield will do the rest.
|
||||
|
||||
### Github Codespaces
|
||||
**Setup**:
|
||||
- Add your prefix (like `internal_`, `private_`, or `secret_`)
|
||||
- Toggle strict mode for case sensitivity (on or off — your call)
|
||||
|
||||
Create a new repo from this template, and use the create new code.
|
||||
---
|
||||
|
||||
### Using this Speckle Function
|
||||
### Pattern Matching
|
||||
> **Best for:** Wildcards, regex fans, and complex patterns.
|
||||
|
||||
1. [Create](https://automate.speckle.dev/) a new Speckle Automation.
|
||||
1. Select your Speckle Project and Speckle Model.
|
||||
1. Select the deployed Speckle Function.
|
||||
1. Enter a phrase to use in the comment.
|
||||
1. Click `Create Automation`.
|
||||
Get fancy and use `*`, `?`, or full regular expressions.
|
||||
|
||||
## Getting Started with Creating Your Own Speckle Function
|
||||
**Examples**:
|
||||
- `client_*` matches anything that starts with `client_`
|
||||
- `?_internal` matches `a_internal`, `b_internal`
|
||||
- `/^(secret|private)_.*$/i` matches parameters starting with `secret_` or `private_`, ignoring case
|
||||
|
||||
1. [Register](https://automate.speckle.dev/) your Function with [Speckle Automate](https://automate.speckle.dev/) and select the Python template.
|
||||
1. A new repository will be created in your GitHub account.
|
||||
1. Make changes to your Function in `main.py`. See below for the Developer Requirements and instructions on how to test.
|
||||
1. To create a new version of your Function, create a new [GitHub release](https://docs.github.com/en/repositories/releasing-projects-on-github/managing-releases-in-a-repository) in your repository.
|
||||
---
|
||||
|
||||
## Developer Requirements
|
||||
### Anonymization
|
||||
> **Best for:** Keeping the structure and hiding the details.
|
||||
|
||||
1. Install the following:
|
||||
- [Python 3](https://www.python.org/downloads/)
|
||||
- [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer)
|
||||
1. Run `poetry shell && poetry install` to install the required Python packages.
|
||||
Automatically detect email addresses inside parameter values and anonymize them.
|
||||
> Example:
|
||||
>
|
||||
> 
|
||||
|
||||
## Building and Testing
|
||||
|
||||
The code can be tested locally by running `poetry run pytest`.
|
||||
No setup is needed. Just select and go.
|
||||
|
||||
### Building and running the Docker Container Image
|
||||
---
|
||||
|
||||
Running and testing your code on your machine is a great way to develop your Function; the following instructions are a bit more in-depth and only required if you are having issues with your Function in GitHub Actions or on Speckle Automate.
|
||||
## How to Use Data Shield
|
||||
|
||||
#### Building the Docker Container Image
|
||||
1. **Set up your automation:**
|
||||
- In your Speckle project, head to **Automations**
|
||||
- Click **Add Automation** and choose **Data Shield**
|
||||
- Set your trigger model
|
||||
|
||||
The GitHub Action packages your code into the format required by Speckle Automate. This is done by building a Docker Image, which Speckle Automate runs. You can attempt to build the Docker Image locally to test the building process.
|
||||
2. **Configure your mode:**
|
||||
- Choose Prefix, Pattern, or Anonymization
|
||||
- Add your prefix or pattern if needed
|
||||
- Toggle strict mode if you want case sensitivity
|
||||
|
||||
To build the Docker Container Image, you must have [Docker](https://docs.docker.com/get-docker/) installed.
|
||||
3. **Run it:**
|
||||
- It’ll run automatically when a new version is published — or you can manually run it
|
||||
|
||||
Once you have Docker running on your local machine:
|
||||
4. **Check results:**
|
||||
- Shielded models show up under the `processed/` branch
|
||||
- You’ll get a run report showing what got cleaned
|
||||
- Highlighted changes can be seen directly in the viewer
|
||||
|
||||
1. Open a terminal
|
||||
1. Navigate to the directory in which you cloned this repository
|
||||
1. Run the following command:
|
||||
---
|
||||
## 💡 Tips & Tricks
|
||||
- **Test first!** — Run it on a small test model before going full production.
|
||||
- **Start simple.** Use prefix matching for clear conventions, pattern matching for complexity, or anonymization for safe sharing.
|
||||
- **Regex pro tip:**
|
||||
- Wrap your regex in `/`
|
||||
- Add `i` for case-insensitive matching
|
||||
- Use `^` (start) and `$` (end) for tighter control
|
||||
---
|
||||
## 📚 Example Workflows
|
||||
|
||||
```bash
|
||||
docker build -f ./Dockerfile -t speckle_automate_python_example .
|
||||
```
|
||||
### → Prepping for external sharing
|
||||
- Use pattern matching with `/^(internal|private|confidential)_.*$/i`
|
||||
- Run before sending out models
|
||||
- Share confidently!
|
||||
|
||||
#### Running the Docker Container Image
|
||||
### → Anonymizing client data
|
||||
- Select the Anonymization mode
|
||||
- Run on any models with contact details
|
||||
- Use sanitized versions for demos, public decks, or sales pitches
|
||||
|
||||
Once the GitHub Action has built the image, it is sent to Speckle Automate. When Speckle Automate runs your Function as part of an Automation, it will run the Docker Container Image. You can test that your Docker Container Image runs correctly locally.
|
||||
### → Stripping out project-specific baggage
|
||||
- Prefix matching with something like `projectX_`
|
||||
- Clean your models before turning them into templates
|
||||
|
||||
1. To then run the Docker Container Image, run the following command:
|
||||
---
|
||||
|
||||
```bash
|
||||
docker run --rm speckle_automate_python_example \
|
||||
python -u main.py run \
|
||||
'{"projectId": "1234", "modelId": "1234", "branchName": "myBranch", "versionId": "1234", "speckleServerUrl": "https://speckle.xyz", "automationId": "1234", "automationRevisionId": "1234", "automationRunId": "1234", "functionId": "1234", "functionName": "my function", "functionLogo": "base64EncodedPng"}' \
|
||||
'{}' \
|
||||
yourSpeckleServerAuthenticationToken
|
||||
```
|
||||
## 🛠️ Troubleshooting
|
||||
|
||||
Let's explain this in more detail:
|
||||
- **Not matching anything?** Double-check your pattern or prefix.
|
||||
- **Case mismatch?** Try turning off strict mode.
|
||||
- **Only partly sanitized?** Some complex models might need multiple passes.
|
||||
- **Errors?** Check run logs in the automation report for clues.
|
||||
- **Next Gen vs Legacy**: While v3 data objects are supported, if you're using non-Revit v2 objects, you might experience varied results. Please report any issues.
|
||||
|
||||
`docker run --rm speckle_automate_python_example` tells Docker to run the Docker Container Image we built earlier. `speckle_automate_python_example` is the name of the Docker Container Image. The `--rm` flag tells Docker to remove the container after it has finished running, freeing up space on your machine.
|
||||
---
|
||||
|
||||
The line `python -u main.py run` is the command run inside the Docker Container Image. The rest of the command is the arguments passed to the command. The arguments are:
|
||||
## 🤔 Still stuck?
|
||||
|
||||
- `'{"projectId": "1234", "modelId": "1234", "branchName": "myBranch", "versionId": "1234", "speckleServerUrl": "https://speckle.xyz", "automationId": "1234", "automationRevisionId": "1234", "automationRunId": "1234", "functionId": "1234", "functionName": "my function", "functionLogo": "base64EncodedPng"}'` - the metadata that describes the automation and the function.
|
||||
- `{}` - the input parameters for the function the Automation creator can set. Here, they are blank, but you can add your parameters to test your function.
|
||||
- `yourSpeckleServerAuthenticationToken`—the authentication token for the Speckle Server that the Automation can connect to. This is required to interact with the Speckle Server, for example, to get data from the Model.
|
||||
|
||||
## Resources
|
||||
|
||||
- [Learn](https://speckle.guide/dev/python.html) more about SpecklePy and interacting with Speckle from Python.
|
||||
No worries — we’ve got your back.
|
||||
👉 Post your questions in the [Speckle Community Forum](https://speckle.community) and someone from the team (or one of our awesome community members) will help you out!
|
||||
|
||||
+126
-24
@@ -1,5 +1,5 @@
|
||||
"""Module for parameter actions and matching strategies."""
|
||||
import re
|
||||
|
||||
from abc import ABC, abstractmethod
|
||||
from collections import defaultdict
|
||||
from typing import Any
|
||||
@@ -7,7 +7,7 @@ from typing import Any
|
||||
from speckle_automate import AutomationContext
|
||||
from specklepy.objects import Base
|
||||
|
||||
from data_shield.helpers import PatternChecker
|
||||
from data_shield.matchers import EmailMatcher, PatternChecker
|
||||
|
||||
|
||||
class ParameterMatcher(ABC):
|
||||
@@ -30,7 +30,7 @@ class PrefixMatcher(ParameterMatcher):
|
||||
def matches(self, param_name: str) -> bool:
|
||||
"""Check if the parameter name starts with the match value."""
|
||||
if self.strict_mode:
|
||||
return param_name == self.match_value
|
||||
return param_name.startswith(self.match_value)
|
||||
return param_name.lower().startswith(self.match_value.lower())
|
||||
|
||||
|
||||
@@ -79,29 +79,60 @@ class RemovalAction(ParameterAction):
|
||||
return self.matcher.matches(param_name)
|
||||
|
||||
def apply(
|
||||
self,
|
||||
parameter: dict[str, Any],
|
||||
parent_object: Base,
|
||||
containing_dict: dict[str, Any],
|
||||
parameter_key: str
|
||||
self, parameter: dict[str, Any], parent_object: Base, containing_dict: dict[str, Any] | Base, parameter_key: str
|
||||
) -> None:
|
||||
"""Remove the parameter from the containing dictionary if it matches."""
|
||||
param_name = parameter.get("name", parameter_key)
|
||||
"""Remove the parameter from the containing dictionary if it matches.
|
||||
|
||||
# Remove from the containing dictionary
|
||||
containing_dict.pop(parameter_key, None)
|
||||
This method handles both dictionary-style containers and Base objects with attributes.
|
||||
|
||||
Args:
|
||||
parameter: The parameter dictionary or object
|
||||
parent_object: The parent Speckle object
|
||||
containing_dict: The container (dict or Base object) holding the parameter
|
||||
parameter_key: The key or attribute name of the parameter
|
||||
"""
|
||||
param_name = parameter.get("name", parameter_key)
|
||||
object_id = getattr(parent_object, "id", None)
|
||||
|
||||
# Handle removal based on the container type
|
||||
if isinstance(containing_dict, dict):
|
||||
# Standard dictionary - just pop the key
|
||||
containing_dict.pop(parameter_key, None)
|
||||
elif isinstance(containing_dict, Base):
|
||||
# For Base objects like Revit parameters, try to remove using __dict__
|
||||
try:
|
||||
if hasattr(containing_dict, "__dict__") and parameter_key in containing_dict.__dict__:
|
||||
containing_dict.__dict__.pop(parameter_key)
|
||||
else:
|
||||
# If not in __dict__, try using dynamic attribute removal
|
||||
containing_dict.__dict__.pop(parameter_key, None)
|
||||
except (AttributeError, KeyError, TypeError):
|
||||
# Fallback to alternative methods if direct dict manipulation fails
|
||||
try:
|
||||
delattr(containing_dict, parameter_key)
|
||||
except (AttributeError, TypeError):
|
||||
try:
|
||||
setattr(containing_dict, parameter_key, None)
|
||||
except (AttributeError, TypeError):
|
||||
# If all removal attempts fail, try one more approach specific to Speckle Base objects
|
||||
if (
|
||||
hasattr(containing_dict, "get_dynamic_member_names")
|
||||
and parameter_key in containing_dict.get_dynamic_member_names()
|
||||
):
|
||||
# This is a workaround for dynamic properties in Speckle Base objects
|
||||
application_name = parameter.get("applicationInternalName", parameter_key)
|
||||
if application_name in containing_dict.__dict__:
|
||||
containing_dict.__dict__.pop(application_name)
|
||||
|
||||
# Track affected object and parameter
|
||||
self.affected_parameters[getattr(parent_object, "id", None)].append(param_name)
|
||||
self.affected_parameters[object_id].append(param_name)
|
||||
|
||||
def report(self, automate_context: AutomationContext) -> None:
|
||||
"""Provide feedback based on the action's results."""
|
||||
if not self.affected_parameters:
|
||||
return
|
||||
|
||||
removed_params = set(
|
||||
param for params in self.affected_parameters.values() for param in params
|
||||
)
|
||||
removed_params = set(param for params in self.affected_parameters.values() for param in params)
|
||||
|
||||
message = f"The following parameters were removed: {', '.join(removed_params)}"
|
||||
|
||||
@@ -112,6 +143,81 @@ class RemovalAction(ParameterAction):
|
||||
)
|
||||
|
||||
|
||||
class AnonymizationAction(ParameterAction):
|
||||
"""Action to anonymize email addresses in parameter values."""
|
||||
|
||||
def __init__(self) -> None:
|
||||
"""Initialize the anonymization action with an email matcher."""
|
||||
super().__init__()
|
||||
self.email_matcher = EmailMatcher()
|
||||
self.anonymized_count = 0
|
||||
|
||||
def check(self, param_value: str) -> bool:
|
||||
"""Check if parameter value contains an email address."""
|
||||
return isinstance(param_value, str) and self.email_matcher.contains_email(param_value)
|
||||
|
||||
def apply(
|
||||
self, parameter: dict[str, Any], parent_object: Base, containing_dict: dict[str, Any] | Base, parameter_key: str
|
||||
) -> None:
|
||||
"""Anonymize email addresses in the parameter value."""
|
||||
# Get parameter name and object ID - same as RemovalAction
|
||||
param_name = parameter.get("name", parameter_key)
|
||||
object_id = getattr(parent_object, "id", None)
|
||||
|
||||
# Get the value to anonymize
|
||||
param_value = None
|
||||
|
||||
# For dictionary-style parameters
|
||||
if isinstance(parameter, dict) and "value" in parameter:
|
||||
param_value = parameter["value"]
|
||||
if isinstance(param_value, str) and self.email_matcher.contains_email(param_value):
|
||||
# Anonymize and update
|
||||
anonymized_value = self.email_matcher.anonymize_email(param_value)
|
||||
parameter["value"] = anonymized_value
|
||||
|
||||
# Track affected parameters - EXACTLY like RemovalAction does
|
||||
self.affected_parameters[object_id].append(param_name)
|
||||
self.anonymized_count += 1
|
||||
|
||||
# For Base object parameters (like in Revit)
|
||||
elif isinstance(containing_dict, Base):
|
||||
try:
|
||||
# Try to get the parameter object
|
||||
param_obj = None
|
||||
try:
|
||||
param_obj = containing_dict.__getitem__(parameter_key)
|
||||
except (AttributeError, KeyError, TypeError):
|
||||
param_obj = getattr(containing_dict, parameter_key, None)
|
||||
|
||||
if param_obj and hasattr(param_obj, "value"):
|
||||
param_value = getattr(param_obj, "value")
|
||||
if isinstance(param_value, str) and self.email_matcher.contains_email(param_value):
|
||||
# Anonymize and update
|
||||
anonymized_value = self.email_matcher.anonymize_email(param_value)
|
||||
setattr(param_obj, "value", anonymized_value)
|
||||
|
||||
# Track affected parameters - EXACTLY like RemovalAction does
|
||||
self.affected_parameters[object_id].append(param_name)
|
||||
self.anonymized_count += 1
|
||||
except KeyError:
|
||||
pass # Skip if any error occurs
|
||||
|
||||
def report(self, automate_context: AutomationContext) -> None:
|
||||
"""Provide feedback based on the action's results."""
|
||||
if not self.affected_parameters:
|
||||
return
|
||||
|
||||
# Copy the exact pattern from RemovalAction for consistency
|
||||
anonymized_params = set(param for params in self.affected_parameters.values() for param in params)
|
||||
message = f"Email addresses were anonymized in {len(anonymized_params)} parameters"
|
||||
|
||||
automate_context.attach_info_to_objects(
|
||||
category="Anonymized_Parameters",
|
||||
object_ids=list(self.affected_parameters.keys()),
|
||||
message=message,
|
||||
)
|
||||
|
||||
|
||||
# Factory functions to create specific actions with the right matcher
|
||||
def create_prefix_removal_action(forbidden_prefix: str, strict_mode: bool = False) -> RemovalAction:
|
||||
"""Create a removal action that matches by prefix."""
|
||||
@@ -125,11 +231,7 @@ def create_pattern_removal_action(pattern: str, strict_mode: bool = False) -> Re
|
||||
return RemovalAction(matcher)
|
||||
|
||||
|
||||
# Placeholder for future anonymization action
|
||||
def create_anonymization_action() -> None:
|
||||
"""Create an action that anonymizes email addresses in parameter values.
|
||||
|
||||
This is a placeholder for future implementation.
|
||||
"""
|
||||
# To be implemented
|
||||
return None
|
||||
# Factory function to create anonymization action
|
||||
def create_anonymization_action() -> AnonymizationAction:
|
||||
"""Create an action that anonymizes email addresses in parameter values."""
|
||||
return AnonymizationAction()
|
||||
|
||||
+24
-74
@@ -1,69 +1,20 @@
|
||||
"""Main Automate function for parameter sanitization."""
|
||||
from speckle_automate import AutomationContext
|
||||
from specklepy.objects import Base
|
||||
"""Updated main Automate function for parameter sanitization."""
|
||||
|
||||
from data_shield.actions import ParameterAction, create_pattern_removal_action, create_prefix_removal_action
|
||||
from speckle_automate import AutomationContext
|
||||
|
||||
from data_shield.actions import (
|
||||
create_anonymization_action,
|
||||
create_pattern_removal_action,
|
||||
create_prefix_removal_action,
|
||||
)
|
||||
from data_shield.helpers import ParameterProcessor
|
||||
from data_shield.inputs import FunctionInputs, SanitizationMode
|
||||
from data_shield.traversal import get_data_traversal_rules
|
||||
|
||||
|
||||
class ParameterProcessor:
|
||||
"""Class to handle parameter processing with a removal action."""
|
||||
|
||||
def __init__(self, action: ParameterAction):
|
||||
"""Initialize the parameter processor with a removal action.
|
||||
|
||||
Args:
|
||||
action: The parameter action to apply
|
||||
"""
|
||||
self.action = action
|
||||
self.processed_objects = set()
|
||||
|
||||
def process_context(self, context):
|
||||
"""Process a traversal context to handle parameters and properties.
|
||||
|
||||
Args:
|
||||
context: The traversal context containing the current object
|
||||
"""
|
||||
current_object = context.current
|
||||
|
||||
# Prioritise v3
|
||||
if hasattr(current_object, "properties") and current_object.properties is not None:
|
||||
properties_dict = (
|
||||
current_object.properties.__dict__
|
||||
if isinstance(current_object.properties, Base)
|
||||
else current_object.properties
|
||||
)
|
||||
self.process_properties_dict(properties_dict, current_object)
|
||||
|
||||
# Legacy placeholder for v2, ready for later
|
||||
if hasattr(current_object, "parameters") and current_object.parameters is not None:
|
||||
pass # Add v2 handling when ready
|
||||
|
||||
def process_properties_dict(self, properties_dict, current_object):
|
||||
"""Recursively process v3-style properties dictionary to find and apply the action to parameters.
|
||||
|
||||
Args:
|
||||
properties_dict: The properties dictionary to process
|
||||
current_object: The current object being processed
|
||||
"""
|
||||
for key, value in list(properties_dict.items()): # Safe iteration during mutation
|
||||
if isinstance(value, dict) and "value" in value:
|
||||
param_name = value.get("name", key)
|
||||
|
||||
# Check if parameter matches our criteria
|
||||
if self.action.check(param_name):
|
||||
self.action.apply(value, current_object, properties_dict, key)
|
||||
self.processed_objects.add(current_object.id)
|
||||
|
||||
elif isinstance(value, dict):
|
||||
# Recurse into nested dictionaries
|
||||
self.process_properties_dict(value, current_object)
|
||||
|
||||
|
||||
def automate_function(
|
||||
automate_context: AutomationContext,
|
||||
function_inputs: FunctionInputs,
|
||||
automate_context: AutomationContext,
|
||||
function_inputs: FunctionInputs,
|
||||
) -> None:
|
||||
"""Main function for parameter sanitization.
|
||||
|
||||
@@ -73,37 +24,32 @@ def automate_function(
|
||||
"""
|
||||
# Create appropriate action based on sanitization mode
|
||||
action = None
|
||||
check_values = False
|
||||
|
||||
if function_inputs.sanitization_mode == SanitizationMode.PREFIX_MATCHING:
|
||||
if not function_inputs.parameter_input:
|
||||
automate_context.mark_run_failed("No parameter prefix has been set for PREFIX_MATCHING mode.")
|
||||
return
|
||||
action = create_prefix_removal_action(
|
||||
function_inputs.parameter_input,
|
||||
function_inputs.strict_mode
|
||||
)
|
||||
action = create_prefix_removal_action(function_inputs.parameter_input, function_inputs.strict_mode)
|
||||
|
||||
elif function_inputs.sanitization_mode == SanitizationMode.PATTERN_MATCHING:
|
||||
if not function_inputs.parameter_input:
|
||||
automate_context.mark_run_failed("No parameter pattern has been set for PATTERN_MATCHING mode.")
|
||||
return
|
||||
action = create_pattern_removal_action(
|
||||
function_inputs.parameter_input,
|
||||
function_inputs.strict_mode
|
||||
)
|
||||
action = create_pattern_removal_action(function_inputs.parameter_input, function_inputs.strict_mode)
|
||||
|
||||
elif function_inputs.sanitization_mode == SanitizationMode.ANONYMIZATION:
|
||||
# Anonymization doesn't require a parameter input
|
||||
# Add anonymization action here when implemented
|
||||
automate_context.mark_run_failed("ANONYMIZATION mode not yet implemented.")
|
||||
return
|
||||
# Anonymization doesn't require a parameter input as it automatically detects emails
|
||||
action = create_anonymization_action()
|
||||
# For anonymization, we check values, not names
|
||||
check_values = True
|
||||
|
||||
if not action:
|
||||
automate_context.mark_run_failed("Failed to create a valid action.")
|
||||
return
|
||||
|
||||
# Process the model with the selected action
|
||||
processor = ParameterProcessor(action)
|
||||
processor = ParameterProcessor(action, check_values)
|
||||
|
||||
version_root_object = automate_context.receive_version()
|
||||
speckle_data = get_data_traversal_rules()
|
||||
@@ -142,4 +88,8 @@ def automate_function(
|
||||
# We can pin the result view to the specific version we created.
|
||||
automate_context.set_context_view([f"{new_model_id}@{new_version_id}"], False)
|
||||
|
||||
automate_context.mark_run_success("Parameters processed successfully.")
|
||||
automate_context.mark_run_success(
|
||||
f"Parameters processed successfully with shield function "
|
||||
f"{function_inputs.sanitization_mode.value}"
|
||||
f"{' running in strict mode' if function_inputs.strict_mode else ''}."
|
||||
)
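The mode dispatch in `automate_function` can be read as a small decision table: name-based modes need a non-empty `parameter_input`, while anonymization needs none and switches the processor to value checking. A rough sketch under stated assumptions: `Mode` is a hypothetical stand-in for `SanitizationMode` (its member values are invented), and `plan_run` returns plain strings rather than real action objects.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    """Hypothetical stand-in for SanitizationMode; the member values are assumed."""

    PREFIX_MATCHING = "prefix"
    PATTERN_MATCHING = "pattern"
    ANONYMIZATION = "anonymization"


def plan_run(mode: Mode, parameter_input: str) -> tuple[Optional[str], bool, Optional[str]]:
    """Return (action_kind, check_values, error) for a run (illustrative only)."""
    if mode in (Mode.PREFIX_MATCHING, Mode.PATTERN_MATCHING):
        label = "prefix" if mode is Mode.PREFIX_MATCHING else "pattern"
        if not parameter_input:
            # Name-based modes need something to match against.
            return None, False, f"No parameter {label} has been set for {mode.name} mode."
        return f"remove_by_{label}", False, None
    if mode is Mode.ANONYMIZATION:
        # Value-based mode: emails are detected automatically, so no input is needed.
        return "anonymize", True, None
    return None, False, "Failed to create a valid action."


print(plan_run(Mode.ANONYMIZATION, ""))
```

Returning the error instead of calling `mark_run_failed` directly keeps the decision logic testable without an `AutomationContext`.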
|
||||
|
||||
+161
-33
@@ -1,44 +1,172 @@
|
||||
"""Helper classes and functions for the parameter checker."""
|
||||
import fnmatch
|
||||
import re
|
||||
|
||||
from specklepy.objects import Base
|
||||
|
||||
from data_shield.actions import ParameterAction
|
||||
|
||||
|
||||
class PatternChecker:
|
||||
"""Checks if a parameter name matches a user-defined pattern."""
|
||||
class ParameterProcessor:
|
||||
"""Class to handle parameter processing with various actions."""
|
||||
|
||||
def __init__(self, pattern: str, strict: bool = True):
|
||||
"""Initializes the pattern checker.
|
||||
def __init__(self, action: ParameterAction, check_values: bool = False):
|
||||
"""Initialize the parameter processor with an action.
|
||||
|
||||
Args:
|
||||
pattern: User-defined pattern. Glob by default; /regex/ for regex; /regex/i for ignore-case.
|
||||
strict: Switches case-insensitive matching for both glob and regex (unless overridden by /i in regex).
|
||||
action: The parameter action to apply
|
||||
check_values: If True, check parameter values instead of names
|
||||
"""
|
||||
self.is_regex = pattern.startswith('/') and (pattern.rstrip('i').endswith('/'))
|
||||
self.user_strict = strict
|
||||
self.action = action
|
||||
self.check_values = check_values
|
||||
self.processed_objects = set()
|
||||
# Debug counters
|
||||
self.total_objects_processed = 0
|
||||
self.revit_params_processed = 0
|
||||
|
||||
if self.is_regex:
|
||||
# Check for inline ignore-case flag
|
||||
if pattern.endswith('/i'):
|
||||
self.ignore_case = True
|
||||
pattern_body = pattern[1:-2]
|
||||
def process_context(self, context):
|
||||
"""Process a traversal context to handle parameters and properties.
|
||||
|
||||
Args:
|
||||
context: The traversal context containing the current object
|
||||
"""
|
||||
current_object = context.current
|
||||
self.total_objects_processed += 1
|
||||
|
||||
# First handle modern v3 properties
|
||||
if hasattr(current_object, "properties") and current_object.properties is not None:
|
||||
properties_dict = (
|
||||
current_object.properties.__dict__
|
||||
if isinstance(current_object.properties, Base)
|
||||
else current_object.properties
|
||||
)
|
||||
self.process_properties_dict(properties_dict, current_object)
|
||||
|
||||
# Then handle legacy v2 Revit parameters
|
||||
if hasattr(current_object, "parameters") and current_object.parameters is not None:
|
||||
self.process_revit_parameters(current_object)
|
||||
|
||||
def process_properties_dict(self, properties_dict, current_object):
|
||||
"""Recursively process v3-style properties dictionary to find and apply the action to parameters.
|
||||
|
||||
Args:
|
||||
properties_dict: The properties dictionary to process
|
||||
current_object: The current object being processed
|
||||
"""
|
||||
if not properties_dict:
|
||||
return
|
||||
|
||||
for key, value in list(properties_dict.items()): # Safe iteration during mutation
|
||||
if isinstance(value, dict) and "value" in value:
|
||||
param_name = value.get("name", key)
|
||||
|
||||
# Check based on mode (name or value)
|
||||
if self.check_values:
|
||||
# For value-based actions (like anonymization)
|
||||
param_value = value.get("value", "")
|
||||
if self.action.check(param_value):
|
||||
self.action.apply(value, current_object, properties_dict, key)
|
||||
self.processed_objects.add(current_object.id)
|
||||
else:
|
||||
# For name-based actions (like removal)
|
||||
if self.action.check(param_name):
|
||||
self.action.apply(value, current_object, properties_dict, key)
|
||||
self.processed_objects.add(current_object.id)
|
||||
|
||||
elif isinstance(value, dict):
|
||||
# Recurse into nested dictionaries
|
||||
self.process_properties_dict(value, current_object)
|
||||
|
||||
def process_revit_parameters(self, current_object):
|
||||
"""Process v2 Revit-style parameters to find and apply the action.
|
||||
|
||||
Revit parameters are stored as Base objects with speckle_type 'Objects.BuiltElements.Revit.Parameter'
|
||||
and can be accessed via current_object.parameters.
|
||||
|
||||
Args:
|
||||
current_object: The current object being processed
|
||||
"""
|
||||
if not hasattr(current_object, "parameters") or current_object.parameters is None:
|
||||
return
|
||||
|
||||
parameters = current_object.parameters
|
||||
|
||||
# If parameters is a dictionary rather than a Base object, use it directly
|
||||
if isinstance(parameters, dict):
|
||||
self.process_properties_dict(parameters, current_object)
|
||||
return
|
||||
|
||||
# Get all parameter keys - handle different ways of storing parameters
|
||||
param_keys = []
|
||||
|
||||
# Try get_dynamic_member_names() for Base objects
|
||||
if hasattr(parameters, "get_dynamic_member_names"):
|
||||
param_keys.extend(parameters.get_dynamic_member_names())
|
||||
|
||||
# Try __dict__ for standard attributes
|
||||
if hasattr(parameters, "__dict__"):
|
||||
param_keys.extend(k for k in parameters.__dict__.keys() if not k.startswith("_"))
|
||||
|
||||
# Try dir() as a last resort
|
||||
if not param_keys:
|
||||
param_keys.extend(k for k in dir(parameters) if not k.startswith("_") and k != "get_dynamic_member_names")
|
||||
|
||||
# Process each parameter
|
||||
for parameter_key in param_keys:
|
||||
# Track for debugging
|
||||
self.revit_params_processed += 1
|
||||
|
||||
# Skip known non-parameter attributes
|
||||
if parameter_key in ["speckle_type", "id", "totalChildrenCount"]:
|
||||
continue
|
||||
|
||||
# Get the parameter object using multiple methods
|
||||
param_obj = None
|
||||
param_value = None
|
||||
|
||||
# Try __getitem__ first (common for Revit parameters)
|
||||
try:
|
||||
param_obj = parameters.__getitem__(f"{parameter_key}")
|
||||
except (AttributeError, KeyError, TypeError):
|
||||
try:
|
||||
# Try direct attribute access
|
||||
param_obj = getattr(parameters, parameter_key, None)
|
||||
except KeyError:
|
||||
continue
|
||||
|
||||
# If we couldn't get the parameter, skip it
|
||||
if param_obj is None:
|
||||
continue
|
||||
|
||||
# Prepare a parameter dict with the info we have
|
||||
param_dict = {}
|
||||
|
||||
# Get the name - try from the parameter object first
|
||||
param_name = getattr(param_obj, "name", parameter_key) if isinstance(param_obj, Base) else parameter_key
|
||||
param_dict["name"] = param_name
|
||||
|
||||
# Get the value
|
||||
if isinstance(param_obj, Base) and hasattr(param_obj, "value"):
|
||||
param_value = getattr(param_obj, "value")
|
||||
param_dict["value"] = param_value
|
||||
elif isinstance(param_obj, dict) and "value" in param_obj:
|
||||
param_value = param_obj["value"]
|
||||
param_dict["value"] = param_value
|
||||
else:
|
||||
self.ignore_case = not strict # fallback to global strict setting if no /i flag
|
||||
pattern_body = pattern[1:-1]
|
||||
# If we can't find a value, this might not be a parameter
|
||||
continue
|
||||
|
||||
flags = re.IGNORECASE if self.ignore_case else 0
|
||||
self.regex = re.compile(pattern_body, flags)
|
||||
self.pattern = pattern_body
|
||||
else:
|
||||
self.regex = None
|
||||
self.pattern = pattern
|
||||
self.ignore_case = not strict
|
||||
# Add any other useful metadata
|
||||
param_dict["applicationInternalName"] = parameter_key
|
||||
|
||||
def check(self, param_name: str) -> bool:
|
||||
"""Checks if the parameter name matches the user-defined pattern."""
|
||||
if self.is_regex:
|
||||
return self.regex.search(param_name) is not None
|
||||
# For glob: emulate strict or non-strict
|
||||
if self.ignore_case:
|
||||
return fnmatch.fnmatch(param_name.lower(), self.pattern.lower())
|
||||
else:
|
||||
return fnmatch.fnmatchcase(param_name, self.pattern)
|
||||
# Check based on mode (name or value)
|
||||
if self.check_values:
|
||||
# For value-based actions (like anonymization)
|
||||
if isinstance(param_value, str) and self.action.check(param_value):
|
||||
# Apply the action
|
||||
self.action.apply(param_dict, current_object, parameters, parameter_key)
|
||||
self.processed_objects.add(current_object.id)
|
||||
else:
|
||||
# For name-based actions (like removal)
|
||||
if self.action.check(param_name):
|
||||
# Apply the action
|
||||
self.action.apply(param_dict, current_object, parameters, parameter_key)
|
||||
self.processed_objects.add(current_object.id)
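The recursive walk in `process_properties_dict` can be illustrated without specklepy. The sketch below is a simplified stand-in, not the class above: `process_properties`, `check`, and `apply` are hypothetical names, and the action is reduced to two plain callables. It keeps the two details that matter: iterating over a snapshot (`list(...)`) so entries can be removed mid-loop, and choosing between the parameter name and the parameter value based on `check_values`.

```python
from typing import Callable


def process_properties(
    props: dict,
    check: Callable[[object], bool],
    apply: Callable[[dict, dict, str], None],
    check_values: bool = False,
) -> int:
    """Walk nested dicts and apply an action to matching parameters; return the hit count."""
    if not props:
        return 0
    hits = 0
    # Iterate over a snapshot so `apply` may delete keys from `props`.
    for key, value in list(props.items()):
        if isinstance(value, dict) and "value" in value:
            # A dict with a "value" entry is treated as a parameter.
            candidate = value.get("value", "") if check_values else value.get("name", key)
            if check(candidate):
                apply(value, props, key)
                hits += 1
        elif isinstance(value, dict):
            # Any other dict is a container: recurse into it.
            hits += process_properties(value, check, apply, check_values)
    return hits


# Illustrative data only.
model = {"Parameters": {"Instance": {
    "speckle_id": {"name": "SPECKLE_ID", "value": "abc"},
    "Mark": {"name": "Mark", "value": "jane@example.com"},
}}}

removed = process_properties(
    model,
    check=lambda name: isinstance(name, str) and name.lower().startswith("speckle"),
    apply=lambda param, container, key: container.pop(key),
)
print(removed, model)
```

With `check_values=True` the same walk hands `check` the parameter value instead, which is how the anonymization mode finds email addresses regardless of parameter name.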
|
||||
|
||||
@@ -51,6 +51,7 @@ class FunctionInputs(AutomateBase):
|
||||
)
|
||||
|
||||
strict_mode: bool = Field(
|
||||
title="Case Sensitivity Strict Mode",
|
||||
default=False,
|
||||
description="If checked, matching is case-sensitive. If unchecked, case-insensitive."
|
||||
)
|
||||
|
||||
@@ -0,0 +1,146 @@
|
||||
"""Module for parameter matching strategies and pattern checking."""
|
||||
|
||||
import fnmatch
|
||||
import re
|
||||
from abc import ABC, abstractmethod
|
||||
from re import Pattern
|
||||
|
||||
|
||||
class ParameterMatcher(ABC):
|
||||
"""Strategy interface for parameter matching logic."""
|
||||
|
||||
def __init__(self, match_value: str, strict_mode: bool = False):
|
||||
"""Initialize with a value to match against and a strict mode flag."""
|
||||
self.match_value = match_value
|
||||
self.strict_mode = strict_mode
|
||||
|
||||
@abstractmethod
|
||||
def matches(self, param_name: str) -> bool:
|
||||
"""Check if parameter name matches according to this strategy."""
|
||||
pass
|
||||
|
||||
|
||||
class PrefixMatcher(ParameterMatcher):
|
||||
"""Matches parameters by prefix."""
|
||||
|
||||
def matches(self, param_name: str) -> bool:
|
||||
"""Check if the parameter name starts with the match value."""
|
||||
if self.strict_mode:
|
||||
return param_name.startswith(self.match_value)
|
||||
return param_name.lower().startswith(self.match_value.lower())
|
||||
|
||||
|
||||
class PatternMatcher(ParameterMatcher):
"""Matches parameters by glob pattern, or by regex when written as /regex/."""
|
||||
|
||||
def matches(self, param_name: str) -> bool:
"""Check if the parameter name matches the glob or regex pattern."""
|
||||
pattern_checker = PatternChecker(self.match_value, self.strict_mode)
|
||||
return pattern_checker.check(param_name)
|
||||
|
||||
|
||||
class PatternChecker:
|
||||
"""Checks if a parameter name matches a user-defined pattern."""
|
||||
|
||||
def __init__(self, pattern: str, strict: bool = True):
|
||||
"""Initializes the pattern checker.
|
||||
|
||||
Args:
|
||||
pattern: User-defined pattern. Glob by default; /regex/ for regex; /regex/i for ignore-case.
|
||||
strict: If True, matching is case-sensitive for both glob and regex; a trailing /i on a regex forces case-insensitive matching regardless.
|
||||
"""
|
||||
self.is_regex = pattern.startswith("/") and (pattern.rstrip("i").endswith("/"))
|
||||
self.user_strict = strict
|
||||
|
||||
if self.is_regex:
|
||||
# Check for inline ignore-case flag
|
||||
if pattern.endswith("/i"):
|
||||
self.ignore_case = True
|
||||
pattern_body = pattern[1:-2]
|
||||
else:
|
||||
self.ignore_case = not strict # fallback to global strict setting if no /i flag
|
||||
pattern_body = pattern[1:-1]
|
||||
|
||||
flags = re.IGNORECASE if self.ignore_case else 0
|
||||
self.regex = re.compile(pattern_body, flags)
|
||||
self.pattern = pattern_body
|
||||
else:
|
||||
self.regex = None
|
||||
self.pattern = pattern
|
||||
self.ignore_case = not strict
|
||||
|
||||
def check(self, param_name: str) -> bool:
|
||||
"""Checks if the parameter name matches the user-defined pattern."""
|
||||
if self.is_regex:
|
||||
return self.regex.search(param_name) is not None
|
||||
# For glob: emulate strict or non-strict
|
||||
if self.ignore_case:
|
||||
return fnmatch.fnmatch(param_name.lower(), self.pattern.lower())
|
||||
else:
|
||||
return fnmatch.fnmatchcase(param_name, self.pattern)
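The pattern syntax handled by `PatternChecker` (glob by default, `/regex/` for regex, `/regex/i` to force case-insensitivity) can be condensed into a single function for experimentation. This is a sketch, not the class above: `matches_pattern` is a hypothetical name, it recompiles the regex on every call, and the non-strict glob branch uses `fnmatch.fnmatchcase` on lowercased strings as an assumed platform-independent substitute for the `fnmatch.fnmatch` call in the diff.

```python
import fnmatch
import re


def matches_pattern(name: str, pattern: str, strict: bool = True) -> bool:
    """Return True if `name` matches a glob or /regex/ pattern (illustrative helper)."""
    is_regex = pattern.startswith("/") and pattern.rstrip("i").endswith("/")
    if is_regex:
        if pattern.endswith("/i"):
            # Inline flag wins over the strict setting.
            body, ignore_case = pattern[1:-2], True
        else:
            body, ignore_case = pattern[1:-1], not strict
        flags = re.IGNORECASE if ignore_case else 0
        return re.search(body, name, flags) is not None
    if strict:
        return fnmatch.fnmatchcase(name, pattern)
    # Case-insensitive glob: compare lowercased strings.
    return fnmatch.fnmatchcase(name.lower(), pattern.lower())


print(matches_pattern("Speckle_ID", "/.*?peckl.*/i"))
print(matches_pattern("Speckle_ID", "speckle*", strict=False))
```

Note that regex patterns use `search`, so they match anywhere in the name, while glob patterns must match the whole name.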
|
||||
|
||||
|
||||
class EmailMatcher:
|
||||
"""Class for identifying and anonymizing email addresses in parameter values."""
|
||||
|
||||
# Email regex pattern - basic pattern to identify email addresses
|
||||
EMAIL_PATTERN = r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}"
|
||||
|
||||
def __init__(self):
|
||||
"""Initialize with a compiled regex pattern for email matching."""
|
||||
self.pattern: Pattern = re.compile(self.EMAIL_PATTERN)
|
||||
|
||||
def contains_email(self, value: str) -> bool:
|
||||
"""Check if a string contains an email address.
|
||||
|
||||
Args:
|
||||
value: The string to check for email addresses
|
||||
|
||||
Returns:
|
||||
bool: True if the string contains an email address, False otherwise
|
||||
"""
|
||||
if not isinstance(value, str):
|
||||
return False
|
||||
|
||||
return bool(self.pattern.search(value))
|
||||
|
||||
def anonymize_email(self, value: str) -> str:
|
||||
"""Anonymize email addresses in a string.
|
||||
|
||||
The function keeps the first and last characters of each local part and
replaces the characters between them with asterisks, preserving the domain
part. Local parts of one or two characters are masked more aggressively.

Example: "email@example.com" becomes "e***l@example.com"
|
||||
|
||||
Args:
|
||||
value: The string containing email addresses to anonymize
|
||||
|
||||
Returns:
|
||||
str: The string with anonymized email addresses
|
||||
"""
|
||||
if not isinstance(value, str):
|
||||
return value
|
||||
|
||||
def replace_email(match_obj):
|
||||
"""Replace function for regex sub to anonymize matched emails."""
|
||||
email = match_obj.group(0)
|
||||
|
||||
# Split the email into local part and domain part
|
||||
local, domain = email.split("@", 1)
|
||||
|
||||
# Anonymize the local part: keep first and last character, replace rest with asterisks
|
||||
if len(local) > 2:
|
||||
# For longer local parts, keep first and last characters
|
||||
anonymized_local = local[0] + "*" * (len(local) - 2) + local[-1]
|
||||
elif len(local) == 2:
|
||||
# For 2-character local parts, show first character and one asterisk
|
||||
anonymized_local = local[0] + "*"
|
||||
else:
|
||||
# For 1-character local parts, just use an asterisk
|
||||
anonymized_local = "*"
|
||||
|
||||
# Return the anonymized email
|
||||
return f"{anonymized_local}@{domain}"
|
||||
|
||||
# Replace all email addresses in the string
|
||||
return self.pattern.sub(replace_email, value)
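The masking rule in `anonymize_email` is easiest to check on concrete inputs. The following standalone sketch mirrors the logic shown in the diff using the same basic email pattern; the function names `mask_local_part` and `anonymize_emails` are hypothetical and are not part of `data_shield`.

```python
import re

# Same basic pattern as EMAIL_PATTERN in the diff; it is not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")


def mask_local_part(local: str) -> str:
    """Mask the part of an address before the '@'."""
    if len(local) > 2:
        # Keep first and last characters, mask everything in between.
        return local[0] + "*" * (len(local) - 2) + local[-1]
    if len(local) == 2:
        return local[0] + "*"
    return "*"


def anonymize_emails(text: str) -> str:
    """Replace every email address in `text` with its masked form."""

    def _replace(match: re.Match) -> str:
        local, domain = match.group(0).split("@", 1)
        return f"{mask_local_part(local)}@{domain}"

    return EMAIL_RE.sub(_replace, text)


print(anonymize_emails("email@example.com"))  # e***l@example.com
print(anonymize_emails("Contact jane.doe@speckle.systems now"))
```

The domain is left intact and the mask preserves the length of the local part, so this is a light obfuscation rather than irreversible anonymization; whether that is sufficient depends on the data-protection requirement being addressed.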
|
||||
@@ -1,12 +0,0 @@
|
||||
# rule_processor.py
|
||||
def process_rules(speckle_objects: list[Base], rules: list[dict], action_handler) -> dict:
|
||||
"""Process rules against objects and apply actions."""
|
||||
results = {}
|
||||
|
||||
for obj in speckle_objects:
|
||||
for rule in rules:
|
||||
if evaluate_rule(obj, rule):
|
||||
action_handler.apply_action(obj, rule)
|
||||
results[obj.id] = rule["action_type"]
|
||||
|
||||
return results
|
||||
@@ -1,70 +0,0 @@
|
||||
"""A collection of rules for processing parameters in Speckle objects."""
|
||||
from collections.abc import Callable
|
||||
|
||||
from specklepy.objects import Base
|
||||
|
||||
# We're going to define a set of rules that will allow us to filter and
|
||||
# process parameters in our Speckle objects. These rules will be encapsulated
|
||||
# in a class called `ParameterRules`.
|
||||
|
||||
|
||||
class ParameterRules:
|
||||
"""A collection of rules for processing parameters in Speckle objects.
|
||||
|
||||
This class provides static methods that return lambda functions. These
|
||||
lambda functions serve as filters or conditions we can use in our main
|
||||
processing logic. By encapsulating these rules, we can easily extend
|
||||
or modify them in the future.
|
||||
"""
|
||||
|
||||
@staticmethod
|
||||
def speckle_type_rule(desired_type: str) -> Callable[[Base], bool]:
|
||||
"""Rule: Check if a parameter's speckle_type matches the desired type."""
|
||||
return (
|
||||
lambda parameter: getattr(parameter, "speckle_type", None) == desired_type
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
def forbidden_prefix_rule(given_prefix: str) -> Callable[[Base], bool]:
|
||||
"""Rule: check if a parameter's name starts with a given prefix.
|
||||
|
||||
This is a simple check, but there could be more complex naming rules for parameters of
|
||||
different types. For example, a rule that checks if a parameter's name starts with a given string
|
||||
exists particularly within IFC where parameters are often prefixed with "Ifc" or "PSet".
|
||||
"""
|
||||
return lambda parameter: parameter.name.startswith(given_prefix)
|
||||
|
||||
# This example Automate function is for prefixed parameter removal. Additional example rules below follow the same
|
||||
# pattern, but with different logic. In some instances there is a strong coupling between the action and the
|
||||
# checking logic, and in others there is a looser coupling. Which is why I have defined the actions separately from
|
||||
# the checking logic.
|
||||
|
||||
@staticmethod
|
||||
def has_missing_value(parameter: dict[str, str]) -> bool:
|
||||
"""Rule: Missing Value Check.
|
||||
|
||||
The AEC industry often requires all parameters to have meaningful values.
|
||||
This rule checks if a parameter is missing its value, potentially indicating
|
||||
an oversight during data entry or transfer.
|
||||
"""
|
||||
return not parameter.get("value")
|
||||
|
||||
@staticmethod
|
||||
def has_default_value(parameter: dict[str, str]) -> bool:
|
||||
"""Rule: Default Value Check.
|
||||
|
||||
Default values can sometimes creep into final datasets due to software defaults.
|
||||
This rule identifies parameters that still have their default values, helping
|
||||
to highlight areas where real, meaningful values need to be provided.
|
||||
"""
|
||||
return parameter.get("value") == "Default"
|
||||
|
||||
@staticmethod
|
||||
def parameter_exists(parameter_name: str, parent_object: dict[str, str]) -> bool:
|
||||
"""Rule: Parameter Existence Check.
|
||||
|
||||
For certain critical parameters, their mere presence (or lack thereof) is vital.
|
||||
This rule verifies if a specific parameter exists within an object, allowing
|
||||
teams to ensure that key data points are always present.
|
||||
"""
|
||||
return parameter_name in parent_object.get("parameters", {})
|
||||
@@ -1,9 +1,10 @@
|
||||
"""This module defines a function that generates traversal rules for navigating."""
|
||||
|
||||
from specklepy.objects.graph_traversal.traversal import GraphTraversal, TraversalRule
|
||||
|
||||
|
||||
def get_data_traversal_rules() -> GraphTraversal:
|
||||
"""
|
||||
Generates traversal rules for navigating Speckle data structures.
|
||||
"""Generates traversal rules for navigating Speckle data structures.
|
||||
|
||||
This function defines and returns traversal rules tailored for Speckle data.
|
||||
These rules are used to navigate and extract specific data properties
|
||||
@@ -33,9 +34,7 @@ def get_data_traversal_rules() -> GraphTraversal:
|
||||
|
||||
display_value_rule = TraversalRule(
|
||||
[
|
||||
lambda o: any(
|
||||
getattr(o, alias, None) for alias in display_value_property_aliases
|
||||
),
|
||||
lambda o: any(getattr(o, alias, None) for alias in display_value_property_aliases),
|
||||
lambda o: "Geometry" in o.speckle_type,
|
||||
],
|
||||
lambda o: elements_property_aliases,
|
||||
@@ -46,4 +45,4 @@ def get_data_traversal_rules() -> GraphTraversal:
|
||||
lambda o: o.get_member_names(),
|
||||
)
|
||||
|
||||
return GraphTraversal([display_value_rule, default_rule])
|
||||
return GraphTraversal([display_value_rule, default_rule])
|
||||
|
||||
@@ -1,4 +1,5 @@
|
||||
"""Run integration tests with a speckle server."""
|
||||
|
||||
from speckle_automate import (
|
||||
AutomationContext,
|
||||
AutomationRunData,
|
||||
@@ -12,6 +13,7 @@ from data_shield.function import FunctionInputs, SanitizationMode, automate_func
|
||||
|
||||
class TestFunction:
|
||||
"""Test the automate function."""
|
||||
|
||||
def test_function_run(self, test_automation_run_data: AutomationRunData, test_automation_token: str) -> None:
|
||||
"""Run an integration test for the automate function."""
|
||||
automation_context = AutomationContext.initialize(test_automation_run_data, test_automation_token)
|
||||
@@ -20,9 +22,9 @@ class TestFunction:
|
||||
automation_context,
|
||||
automate_function,
|
||||
FunctionInputs(
|
||||
sanitization_mode=SanitizationMode.PATTERN_MATCHING,
|
||||
parameter_input="/.*?peckl.*/i",
|
||||
strict_mode=True,
|
||||
sanitization_mode=SanitizationMode.ANONYMIZATION,
|
||||
parameter_input="SPECKLE",
|
||||
strict_mode=False,
|
||||
),
|
||||
)
|
||||
|
||||
|
||||