7 Commits

Author SHA1 Message Date
dependabot[bot] 3e2462c127 Bump actions/setup-python from 5 to 6
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-04 23:12:11 +00:00
Jonathon Broughton be1ac7b9bb Update README.md 2025-03-25 01:54:56 +00:00
Jonathon Broughton 5c09e22358 Refactor anonymization logic and improve checks
build and deploy Speckle functions / publish-automate-function-version (push) Has been cancelled
- Simplified email check method to ensure param_value is a string.
- Streamlined apply method for better handling of parameter values.
- Enhanced error handling when accessing parameters in Base objects.
- Added debug counters for processed objects in the ParameterProcessor class.
- Updated test cases to reflect changes in input parameters.
2025-03-25 01:49:29 +00:00
Jonathon Broughton eafc75c2f7 Update README.md 2025-03-25 01:23:55 +00:00
Jonathon Broughton f7b72d4516 Update parameter handling for v2 Revit
- Changed legacy placeholder to handle v2 parameters
- Added processing for Revit parameters in the current object
2025-03-25 01:22:45 +00:00
Jonathon Broughton a865bd64db Update .gitignore to exclude new env file
Added exclusion for the .env-v3 file to keep it out of version control.
2025-03-25 01:16:48 +00:00
Jonathon Broughton 6958cc423b Update README for Data Shield features
- Renamed "Sanitization Modes" to "Shield Modes"
- Changed trigger description to "trigger model"
- Updated running instructions for clarity
- Adjusted terminology from "sanitized models" to "shielded models"
- Added tips section with regex advice and workflow examples
2025-03-25 01:05:13 +00:00
6 changed files with 135 additions and 91 deletions
+1 -1
@@ -12,7 +12,7 @@ jobs:
runs-on: ubuntu-latest runs-on: ubuntu-latest
steps: steps:
- uses: actions/checkout@v4.2.2 - uses: actions/checkout@v4.2.2
- uses: actions/setup-python@v5 - uses: actions/setup-python@v6
with: with:
python-version: '3.13' python-version: '3.13'
+2
@@ -313,3 +313,5 @@ pyrightconfig.json
.ionide .ionide
# End of https://www.toptal.com/developers/gitignore/api/visualstudiocode,python,pycharm # End of https://www.toptal.com/developers/gitignore/api/visualstudiocode,python,pycharm
.env-v3
+17 -17
@@ -1,8 +1,6 @@
# 🛡️ Data Shield — User Guide # 🛡️ Data Shield — User Guide
**Data Shield** is a Speckle Automate function that helps you keep your model data clean, safe, and share-ready. Whether you're sending models to clients, collaborators, or just tidying up before archiving Data Shield's got your back. **Data Shield** is a Speckle Automate function that helps you keep your model data clean, safe, and share-ready. Whether you're sending models to clients or collaborators or just tidying up before archiving, Data Shield has your back.
---
## ✨ What Data Shield Does ## ✨ What Data Shield Does
@@ -15,7 +13,7 @@ Data Shield scans your Speckle model for parameters you'd rather not share and
--- ---
## Sanitization Modes ## Shield Modes
We know one size doesn't fit all, so Data Shield offers three modes to suit your style: We know one size doesn't fit all, so Data Shield offers three modes to suit your style:
@@ -23,7 +21,7 @@ We know one size doesn't fit all, so Data Shield offers three modes to suit yo
> **Best for:** Simple, predictable naming conventions. > **Best for:** Simple, predictable naming conventions.
Remove parameters that start with a specific prefix. Remove parameters that start with a specific prefix.
> Example: Want to remove everything starting with `secret_`? Just set that prefix and Data Shield does the rest. > Example: Want to remove everything starting with `secret_`? Just set that prefix, and Data Shield will do the rest.
**Setup**: **Setup**:
- Add your prefix (like `internal_`, `private_`, or `secret_`) - Add your prefix (like `internal_`, `private_`, or `secret_`)
@@ -44,12 +42,15 @@ Get fancy and use `*`, `?`, or full regular expressions.
--- ---
### Anonymization ### Anonymization
> **Best for:** Keeping the structure, hiding the details. > **Best for:** Keeping the structure and hiding the details.
Automatically detect email addresses inside parameter values and anonymize them. Automatically detect email addresses inside parameter values and anonymize them.
> Example: `john.doe@example.com` becomes `j***@example.com` > Example:
>
> ![image](https://github.com/user-attachments/assets/076e4acd-2257-4ebd-b82b-c151e51c00c0)
No setup needed. Just select and go.
No setup is needed. Just select and go.
--- ---
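The `j***@example.com` example on the old side of this hunk can be sketched as follows. This is only an illustration: `EMAIL_RE` and `mask_emails` are hypothetical names, and the real `EmailMatcher.anonymize_email` is not shown in this diff, so its exact masking rules may differ.

```python
import re

# Hypothetical pattern (an assumption, not the project's EmailMatcher):
# group 1 = first character of the local part, group 2 = the full domain.
EMAIL_RE = re.compile(
    r"\b([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b"
)


def mask_emails(text: str) -> str:
    """Keep the first local-part character and the domain; mask the rest."""
    return EMAIL_RE.sub(lambda m: f"{m.group(1)}***@{m.group(2)}", text)


print(mask_emails("Contact john.doe@example.com today"))
```

Keeping the domain preserves some context (which organisation a contact belongs to) while hiding the individual, which matches the "keep the structure, hide the details" description of this mode.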
@@ -58,7 +59,7 @@ No setup needed. Just select and go.
1. **Set up your automation:** 1. **Set up your automation:**
- In your Speckle project, head to **Automations** - In your Speckle project, head to **Automations**
- Click **Add Automation** and choose **Data Shield** - Click **Add Automation** and choose **Data Shield**
- Set your trigger (like “on new commit”) - Set your trigger model
2. **Configure your mode:** 2. **Configure your mode:**
- Choose Prefix, Pattern, or Anonymization - Choose Prefix, Pattern, or Anonymization
@@ -66,24 +67,22 @@ No setup needed. Just select and go.
- Toggle strict mode if you want case sensitivity - Toggle strict mode if you want case sensitivity
3. **Run it:** 3. **Run it:**
- It'll run automatically when triggered — or you can manually run on specific commits - It'll run automatically when a new version is published — or you can manually run it
4. **Check results:** 4. **Check results:**
- Sanitized models show up under the `processed/` branch - Shielded models show up under the `processed/` branch
- You'll get a run report showing what got cleaned - You'll get a run report showing what got cleaned
- Highlighted changes can be seen directly in the viewer - Highlighted changes can be seen directly in the viewer
::: 💡 Tips & Tricks ---
## 💡 Tips & Tricks
- **Test first!** — Run it on a small test model before going full production. - **Test first!** — Run it on a small test model before going full production.
- **Start simple.** Use prefix matching for clear conventions, pattern matching for complexity, or anonymization for safe sharing. - **Start simple.** Use prefix matching for clear conventions, pattern matching for complexity, or anonymization for safe sharing.
- **Regex pro tip:** - **Regex pro tip:**
- Wrap your regex in `/` - Wrap your regex in `/`
- Add `i` for case-insensitive matching - Add `i` for case-insensitive matching
- Use `^` (start) and `$` (end) for tighter control - Use `^` (start) and `$` (end) for tighter control
::: ---
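The regex tips above (wrap in `/`, add `i`, anchor with `^` and `$`) might be interpreted roughly like this. The helper `name_matches` is hypothetical, and the fallback to shell-style wildcards is an assumption based on the README's mention of `*` and `?`; Data Shield's actual pattern parser is not part of this diff.

```python
import re
from fnmatch import fnmatchcase


def name_matches(raw_pattern: str, name: str) -> bool:
    """Match a parameter name against '/regex/', '/regex/i', or a wildcard."""
    if len(raw_pattern) >= 2 and raw_pattern.startswith("/"):
        body, closing_slash, flags = raw_pattern[1:].rpartition("/")
        if closing_slash:  # input looked like /body/flags
            options = re.IGNORECASE if "i" in flags else 0
            return re.search(body, name, options) is not None
    # Assumed fallback: treat the input as a case-sensitive * and ? wildcard
    return fnmatchcase(name, raw_pattern)


print(name_matches("/^secret_/i", "SECRET_cost"))  # case-insensitive regex
print(name_matches("/^cost$/", "cost_total"))      # anchored at both ends
print(name_matches("secret_*", "secret_cost"))     # wildcard form
```

Without `^` and `$`, a regex such as `/cost/` would match anywhere in the name, which is why the tips recommend anchors for tighter control.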
## 📚 Example Workflows ## 📚 Example Workflows
### → Prepping for external sharing ### → Prepping for external sharing
@@ -92,7 +91,7 @@ No setup needed. Just select and go.
- Share confidently! - Share confidently!
### → Anonymizing client data ### → Anonymizing client data
- Select Anonymization mode - Select the Anonymization mode
- Run on any models with contact details - Run on any models with contact details
- Use sanitized versions for demos, public decks, or sales pitches - Use sanitized versions for demos, public decks, or sales pitches
@@ -108,6 +107,7 @@ No setup needed. Just select and go.
- **Case mismatch?** Try turning off strict mode. - **Case mismatch?** Try turning off strict mode.
- **Only partly sanitized?** Some complex models might need multiple passes. - **Only partly sanitized?** Some complex models might need multiple passes.
- **Errors?** Check run logs in the automation report for clues. - **Errors?** Check run logs in the automation report for clues.
- **Next Gen vs Legacy**: While v3 data objects are supported, if you're using non-Revit v2 objects, you might experience varied results. Please report any issues.
--- ---
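For the "Case mismatch?" troubleshooting item, here is a small sketch of how prefix matching could behave with strict mode on and off. `strip_prefixed` is a hypothetical helper, and the assumption that strict mode means case-sensitive comparison comes from the README's "Toggle strict mode if you want case sensitivity"; the function's real implementation is not shown in this diff.

```python
from typing import Any


def strip_prefixed(params: dict[str, Any], prefix: str, strict: bool = False) -> dict[str, Any]:
    """Return a copy of params without the keys that start with prefix."""

    def has_prefix(name: str) -> bool:
        if strict:
            return name.startswith(prefix)  # case-sensitive
        return name.casefold().startswith(prefix.casefold())

    return {name: value for name, value in params.items() if not has_prefix(name)}


params = {"secret_cost": 1200, "Secret_Margin": 0.3, "width": 4.5}
print(strip_prefixed(params, "secret_", strict=True))   # keeps Secret_Margin
print(strip_prefixed(params, "secret_", strict=False))  # removes both
```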
+35 -45
@@ -150,75 +150,65 @@ class AnonymizationAction(ParameterAction):
"""Initialize the anonymization action with an email matcher.""" """Initialize the anonymization action with an email matcher."""
super().__init__() super().__init__()
self.email_matcher = EmailMatcher() self.email_matcher = EmailMatcher()
# Count of anonymized parameters for reporting
self.anonymized_count = 0 self.anonymized_count = 0
def check(self, param_value: str) -> bool: def check(self, param_value: str) -> bool:
"""Check if parameter value contains an email address. """Check if parameter value contains an email address."""
return isinstance(param_value, str) and self.email_matcher.contains_email(param_value)
Args:
param_value: The parameter value to check
Returns:
bool: True if the parameter value contains an email address, False otherwise
"""
return self.email_matcher.contains_email(param_value)
def apply( def apply(
self, parameter: dict[str, Any], parent_object: Base, containing_dict: dict[str, Any] | Base, parameter_key: str self, parameter: dict[str, Any], parent_object: Base, containing_dict: dict[str, Any] | Base, parameter_key: str
) -> None: ) -> None:
"""Anonymize email addresses in the parameter value. """Anonymize email addresses in the parameter value."""
# Get parameter name and object ID - same as RemovalAction
Args:
parameter: The parameter dictionary
parent_object: The parent Speckle object
containing_dict: The container (dict or Base object) holding the parameter
parameter_key: The key or attribute name of the parameter
"""
if "value" not in parameter or not isinstance(parameter["value"], str):
return
param_name = parameter.get("name", parameter_key) param_name = parameter.get("name", parameter_key)
original_value = parameter["value"]
object_id = getattr(parent_object, "id", None) object_id = getattr(parent_object, "id", None)
# Anonymize email addresses in the parameter value # Get the value to anonymize
anonymized_value = self.email_matcher.anonymize_email(original_value) param_value = None
# Only track changes if something was actually anonymized # For dictionary-style parameters
if anonymized_value != original_value: if isinstance(parameter, dict) and "value" in parameter:
# Update the parameter value in place param_value = parameter["value"]
parameter["value"] = anonymized_value if isinstance(param_value, str) and self.email_matcher.contains_email(param_value):
# Anonymize and update
anonymized_value = self.email_matcher.anonymize_email(param_value)
parameter["value"] = anonymized_value
# If we're dealing with a Base object parameter (like in Revit), # Track affected parameters - EXACTLY like RemovalAction does
# update the actual value property of the parameter object self.affected_parameters[object_id].append(param_name)
if isinstance(containing_dict, Base): self.anonymized_count += 1
# For Base object parameters (like in Revit)
elif isinstance(containing_dict, Base):
try:
# Try to get the parameter object
param_obj = None
try: try:
# Try to get the parameter object using __getitem__ first (Revit v2 style)
param_obj = containing_dict.__getitem__(parameter_key) param_obj = containing_dict.__getitem__(parameter_key)
if param_obj is not None and hasattr(param_obj, "value"):
setattr(param_obj, "value", anonymized_value)
except (AttributeError, KeyError, TypeError): except (AttributeError, KeyError, TypeError):
# Fallback to standard attribute access
param_obj = getattr(containing_dict, parameter_key, None) param_obj = getattr(containing_dict, parameter_key, None)
if param_obj is not None and hasattr(param_obj, "value"):
if param_obj and hasattr(param_obj, "value"):
param_value = getattr(param_obj, "value")
if isinstance(param_value, str) and self.email_matcher.contains_email(param_value):
# Anonymize and update
anonymized_value = self.email_matcher.anonymize_email(param_value)
setattr(param_obj, "value", anonymized_value) setattr(param_obj, "value", anonymized_value)
# Track affected object and parameter # Track affected parameters - EXACTLY like RemovalAction does
self.affected_parameters[object_id].append(param_name) self.affected_parameters[object_id].append(param_name)
self.anonymized_count += 1 self.anonymized_count += 1
except KeyError:
pass # Skip if any error occurs
def report(self, automate_context: AutomationContext) -> None: def report(self, automate_context: AutomationContext) -> None:
"""Provide feedback based on the action's results. """Provide feedback based on the action's results."""
Args:
automate_context: The automation context
"""
if not self.affected_parameters: if not self.affected_parameters:
return return
# Copy the exact pattern from RemovalAction for consistency
anonymized_params = set(param for params in self.affected_parameters.values() for param in params) anonymized_params = set(param for params in self.affected_parameters.values() for param in params)
message = f"Email addresses were anonymized in {len(anonymized_params)} parameters" message = f"Email addresses were anonymized in {len(anonymized_params)} parameters"
automate_context.attach_info_to_objects( automate_context.attach_info_to_objects(
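The reporting pattern in `report` can be illustrated in isolation. The sketch below assumes `affected_parameters` behaves like a `defaultdict(list)` keyed by object id, which the `append` calls in `apply` suggest but the diff does not show; the object ids and parameter names are invented for the example.

```python
from collections import defaultdict

# Assumed shape: object id -> list of parameter names touched on that object
affected_parameters = defaultdict(list)
anonymized_count = 0

for object_id, param_name in [
    ("obj-1", "Contact Email"),
    ("obj-2", "Contact Email"),
    ("obj-2", "Owner"),
]:
    affected_parameters[object_id].append(param_name)
    anonymized_count += 1

# Same flattening as in report(): unique parameter names across all objects
anonymized_params = {name for names in affected_parameters.values() for name in names}
message = f"Email addresses were anonymized in {len(anonymized_params)} parameters"

print(message)           # counts unique names
print(anonymized_count)  # counts every individual change
```

Under these assumptions, the number in the message is the count of distinct parameter names, so it can be lower than `anonymized_count` when the same parameter is anonymized on several objects.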
+76 -26
@@ -5,7 +5,6 @@ from specklepy.objects import Base
from data_shield.actions import ParameterAction from data_shield.actions import ParameterAction
# Modified ParameterProcessor class imported from processor_update.py
class ParameterProcessor: class ParameterProcessor:
"""Class to handle parameter processing with various actions.""" """Class to handle parameter processing with various actions."""
@@ -19,6 +18,9 @@ class ParameterProcessor:
self.action = action self.action = action
self.check_values = check_values self.check_values = check_values
self.processed_objects = set() self.processed_objects = set()
# Debug counters
self.total_objects_processed = 0
self.revit_params_processed = 0
def process_context(self, context): def process_context(self, context):
"""Process a traversal context to handle parameters and properties. """Process a traversal context to handle parameters and properties.
@@ -27,8 +29,9 @@ class ParameterProcessor:
context: The traversal context containing the current object context: The traversal context containing the current object
""" """
current_object = context.current current_object = context.current
self.total_objects_processed += 1
# Prioritise v3 # First handle modern v3 properties
if hasattr(current_object, "properties") and current_object.properties is not None: if hasattr(current_object, "properties") and current_object.properties is not None:
properties_dict = ( properties_dict = (
current_object.properties.__dict__ current_object.properties.__dict__
@@ -37,9 +40,9 @@ class ParameterProcessor:
) )
self.process_properties_dict(properties_dict, current_object) self.process_properties_dict(properties_dict, current_object)
# Legacy placeholder for v2, ready for later # Then handle legacy v2 Revit parameters
if hasattr(current_object, "parameters") and current_object.parameters is not None: if hasattr(current_object, "parameters") and current_object.parameters is not None:
pass # Add v2 handling when ready self.process_revit_parameters(current_object)
def process_properties_dict(self, properties_dict, current_object): def process_properties_dict(self, properties_dict, current_object):
"""Recursively process v3-style properties dictionary to find and apply the action to parameters. """Recursively process v3-style properties dictionary to find and apply the action to parameters.
@@ -48,6 +51,9 @@ class ParameterProcessor:
properties_dict: The properties dictionary to process properties_dict: The properties dictionary to process
current_object: The current object being processed current_object: The current object being processed
""" """
if not properties_dict:
return
for key, value in list(properties_dict.items()): # Safe iteration during mutation for key, value in list(properties_dict.items()): # Safe iteration during mutation
if isinstance(value, dict) and "value" in value: if isinstance(value, dict) and "value" in value:
param_name = value.get("name", key) param_name = value.get("name", key)
@@ -55,7 +61,8 @@ class ParameterProcessor:
# Check based on mode (name or value) # Check based on mode (name or value)
if self.check_values: if self.check_values:
# For value-based actions (like anonymization) # For value-based actions (like anonymization)
if self.action.check(value.get("value", "")): param_value = value.get("value", "")
if self.action.check(param_value):
self.action.apply(value, current_object, properties_dict, key) self.action.apply(value, current_object, properties_dict, key)
self.processed_objects.add(current_object.id) self.processed_objects.add(current_object.id)
else: else:
@@ -82,41 +89,84 @@ class ParameterProcessor:
parameters = current_object.parameters parameters = current_object.parameters
# Use get_dynamic_member_names() to get all parameter keys # If parameters is a dictionary rather than a Base object, use it directly
for parameter_key in parameters.get_dynamic_member_names(): if isinstance(parameters, dict):
# Get the parameter object using __getitem__ self.process_properties_dict(parameters, current_object)
return
# Get all parameter keys - handle different ways of storing parameters
param_keys = []
# Try get_dynamic_member_names() for Base objects
if hasattr(parameters, "get_dynamic_member_names"):
param_keys.extend(parameters.get_dynamic_member_names())
# Try __dict__ for standard attributes
if hasattr(parameters, "__dict__"):
param_keys.extend(k for k in parameters.__dict__.keys() if not k.startswith("_"))
# Try dir() as a last resort
if not param_keys:
param_keys.extend(k for k in dir(parameters) if not k.startswith("_") and k != "get_dynamic_member_names")
# Process each parameter
for parameter_key in param_keys:
# Track for debugging
self.revit_params_processed += 1
# Skip known non-parameter attributes
if parameter_key in ["speckle_type", "id", "totalChildrenCount"]:
continue
# Get the parameter object using multiple methods
param_obj = None
param_value = None
# Try __getitem__ first (common for Revit parameters)
try: try:
param_obj = parameters.__getitem__(f"{parameter_key}") param_obj = parameters.__getitem__(f"{parameter_key}")
except KeyError: except (AttributeError, KeyError, TypeError):
continue try:
# Check if it's a Revit parameter # Try direct attribute access
if ( param_obj = getattr(parameters, parameter_key, None)
not isinstance(param_obj, Base) except KeyError:
or getattr(param_obj, "speckle_type", "") != "Objects.BuiltElements.Revit.Parameter" continue
):
# If we couldn't get the parameter, skip it
if param_obj is None:
continue continue
# For name-based checks, we need to check both the name property and applicationInternalName # Prepare a parameter dict with the info we have
name_to_check = getattr(param_obj, "name", "") param_dict = {}
value_to_check = getattr(param_obj, "value", "")
# Create a parameter dict to pass to the action # Get the name - try from the parameter object first
param_dict = { param_name = getattr(param_obj, "name", parameter_key) if isinstance(param_obj, Base) else parameter_key
"name": name_to_check, param_dict["name"] = param_name
"value": value_to_check,
"applicationInternalName": parameter_key, # Get the value
} if isinstance(param_obj, Base) and hasattr(param_obj, "value"):
param_value = getattr(param_obj, "value")
param_dict["value"] = param_value
elif isinstance(param_obj, dict) and "value" in param_obj:
param_value = param_obj["value"]
param_dict["value"] = param_value
else:
# If we can't find a value, this might not be a parameter
continue
# Add any other useful metadata
param_dict["applicationInternalName"] = parameter_key
# Check based on mode (name or value) # Check based on mode (name or value)
if self.check_values: if self.check_values:
# For value-based actions (like anonymization) # For value-based actions (like anonymization)
if isinstance(value_to_check, str) and self.action.check(value_to_check): if isinstance(param_value, str) and self.action.check(param_value):
# Apply the action # Apply the action
self.action.apply(param_dict, current_object, parameters, parameter_key) self.action.apply(param_dict, current_object, parameters, parameter_key)
self.processed_objects.add(current_object.id) self.processed_objects.add(current_object.id)
else: else:
# For name-based actions (like removal) # For name-based actions (like removal)
if self.action.check(name_to_check): if self.action.check(param_name):
# Apply the action # Apply the action
self.action.apply(param_dict, current_object, parameters, parameter_key) self.action.apply(param_dict, current_object, parameters, parameter_key)
self.processed_objects.add(current_object.id) self.processed_objects.add(current_object.id)
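The key-collection step in `process_revit_parameters` gathers names from `get_dynamic_member_names()` and from `__dict__`, so as written a key reported by both sources appears to be visited twice. The sketch below shows one way to collect keys once, in order. `collect_parameter_keys` and `FakeParameters` are hypothetical stand-ins (no specklepy import), not code from this commit.

```python
from typing import Any

# Same non-parameter attributes the diff skips
SKIPPED_KEYS = {"speckle_type", "id", "totalChildrenCount"}


def collect_parameter_keys(parameters: Any) -> list[str]:
    """Collect candidate parameter keys once each, preserving first-seen order."""
    if isinstance(parameters, dict):
        candidates = list(parameters)
    else:
        candidates = []
        getter = getattr(parameters, "get_dynamic_member_names", None)
        if callable(getter):
            candidates.extend(getter())
        instance_attrs = getattr(parameters, "__dict__", {})
        candidates.extend(k for k in instance_attrs if not k.startswith("_"))
    # dict.fromkeys removes duplicates while keeping insertion order
    return [k for k in dict.fromkeys(candidates) if k not in SKIPPED_KEYS]


class FakeParameters:
    """Stand-in for a Base-like parameters container (illustrative only)."""

    def __init__(self) -> None:
        self.speckle_type = "Base"
        self.WALL_HEIGHT = {"name": "Height", "value": "3.0"}
        self._hidden = 1

    def get_dynamic_member_names(self) -> list[str]:
        return ["WALL_HEIGHT"]


print(collect_parameter_keys(FakeParameters()))
```

Deduplicating before the loop would also keep a debug counter such as `revit_params_processed` closer to the number of distinct parameters, if that is what the counter is meant to report.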
+4 -2
@@ -1,4 +1,5 @@
"""Run integration tests with a speckle server.""" """Run integration tests with a speckle server."""
from speckle_automate import ( from speckle_automate import (
AutomationContext, AutomationContext,
AutomationRunData, AutomationRunData,
@@ -12,6 +13,7 @@ from data_shield.function import FunctionInputs, SanitizationMode, automate_func
class TestFunction: class TestFunction:
"""Test the automate function.""" """Test the automate function."""
def test_function_run(self, test_automation_run_data: AutomationRunData, test_automation_token: str) -> None: def test_function_run(self, test_automation_run_data: AutomationRunData, test_automation_token: str) -> None:
"""Run an integration test for the automate function.""" """Run an integration test for the automate function."""
automation_context = AutomationContext.initialize(test_automation_run_data, test_automation_token) automation_context = AutomationContext.initialize(test_automation_run_data, test_automation_token)
@@ -21,8 +23,8 @@ class TestFunction:
automate_function, automate_function,
FunctionInputs( FunctionInputs(
sanitization_mode=SanitizationMode.ANONYMIZATION, sanitization_mode=SanitizationMode.ANONYMIZATION,
parameter_input="", parameter_input="SPECKLE",
strict_mode=True, strict_mode=False,
), ),
) )