Filtering Sensitive Data

LogStruct protects sensitive data in two ways: parameter filtering, which replaces the values of sensitive keys, and string scrubbing, which redacts sensitive patterns inside string values. Both keep your logs secure while leaving enough information for debugging.

Parameter Filtering

LogStruct automatically filters sensitive values in request parameters, job arguments, and other structured data based on key names. When a sensitive key is detected, its value is replaced with metadata describing the value.

How Parameter Filtering Works

When a key matches one of the configured sensitive keys, the logged metadata gives context (the value's class and, for some keys, a hash) without exposing the value itself:

# Original data
{
  email: "user@example.com",
  password: "secret123",
  user_data: { name: "John Doe", age: 30 }
}

# After filtering
{
  email: { _filtered: { _class: "String", _hash: "a1b2c3d4e5f6" } },
  password: { _filtered: { _class: "String" } },
  user_data: { name: "John Doe", age: 30 }
}

The metadata depends on the type of the filtered value:

  • Strings: Shows class name, but omits byte size for sensitive keys
  • Hashable strings: For keys configured in filter_keys_with_hashes (like email addresses), includes a hash for tracing across logs
  • Hashes: Shows class name, key count, and first 10 keys (but hides byte size if sensitive keys are present)
  • Arrays: Shows class name, count, and byte size
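
To make the key-based replacement concrete, here is a minimal sketch in plain Ruby. It is not LogStruct's implementation: `SENSITIVE_KEYS`, `filter_value`, and `filter_params` are hypothetical names, and only the `_filtered` and `_class` keys are taken from the example output above.

```ruby
# Hypothetical key list: a subset of the defaults listed in this guide.
SENSITIVE_KEYS = %i[password token secret credit_card ssn].freeze

# Returns the value unchanged, or a metadata placeholder for sensitive keys.
def filter_value(key, value)
  return value unless SENSITIVE_KEYS.include?(key.to_sym)

  { _filtered: { _class: value.class.name } }
end

# Applies filter_value to each top-level key of a hash.
def filter_params(params)
  params.to_h { |key, value| [key, filter_value(key, value)] }
end

p filter_params(password: "secret123", user_data: { name: "John Doe", age: 30 })
```

The sketch only looks at top-level keys and only records the class name; the size and key-count metadata described in the list above are left out because their exact field names are not shown in this guide.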

Default Filtered Keys

LogStruct filters these keys by default:

# Passwords and authentication
:password, :password_confirmation, :pass, :pw
:token, :secret
:credentials, :auth, :authentication, :authorization

# Sensitive personal information
:credit_card, :ssn, :social_security

Email Hashing for Request Tracing

LogStruct handles email addresses specially: it replaces each address with a consistent salted hash, so you can trace a user's activity across log entries without logging the address itself:

# These keys have hashed values by default
:email, :email_address

# Example of a log with hashed email
{
  email: { _filtered: { _class: "String", _hash: "a1b2c3d4e5f6" } }
}
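
The tracing property depends on the hash being deterministic for a given salt. The sketch below illustrates that idea only; this guide does not say which digest LogStruct uses, so SHA-256, the `salt:value` layout, and the `traceable_hash` name are all assumptions.

```ruby
require "digest"

# Illustrative only: SHA-256 over "salt:value", truncated to `length` hex characters.
def traceable_hash(value, salt:, length: 12)
  Digest::SHA256.hexdigest("#{salt}:#{value}")[0, length]
end

first  = traceable_hash("user@example.com", salt: "test_salt")
second = traceable_hash("user@example.com", salt: "test_salt")
other  = traceable_hash("user@example.com", salt: "other_salt")

puts first == second # the same address and salt always give the same hash
puts first == other  # changing the salt changes the hash
puts first.length    # length follows the `length` argument
```

Because the hash changes with the salt, rotating the salt breaks correlation with older log entries, which is worth keeping in mind when choosing `hash_salt`.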

Configuring Parameter Filtering

You can customize which keys are filtered and which keys should include hashes:

LogStruct.configure do |config|
  # Configure which params should be filtered
  config.filters.filter_keys = [
    :password, :password_confirmation, :token, :secret,
    :credit_card, :ssn, :social_security
  ]

  # Configure which params should include hashes for values
  config.filters.filter_keys_with_hashes = [
    :email, :email_address
  ]

  # Configure sensitive data filtering for all strings
  config.filters.email_addresses = true      # Filter email addresses
  config.filters.url_passwords = true        # Filter passwords in URLs
  config.filters.credit_card_numbers = true  # Filter credit card numbers
  config.filters.phone_numbers = true        # Filter phone numbers
  config.filters.ssns = true                 # Filter social security numbers
  config.filters.ip_addresses = false        # Filter IP addresses (off by default)
  config.filters.mac_addresses = false       # Filter MAC addresses (off by default)

  # Configure the salt used for hashing filtered email addresses
  config.filters.hash_salt = ENV.fetch("EMAIL_HASH_SALT", "test_salt")

  # Configure the length of hash output for filtered emails (default: 12)
  config.filters.hash_length = 12
end

String Scrubbing

In addition to filtering based on key names, LogStruct automatically scans every string value for patterns that look like sensitive data, regardless of the key the value is stored under.

Special thanks to ankane for creating the logstop gem. We use a vendored fork of the logstop formatter code with some modifications.

How String Scrubbing Works

String scrubbing uses regular expressions to identify and replace sensitive data patterns with descriptive placeholders:

# Original log message
"User user@example.com created credit card 4111-1111-1111-1111"

# After string scrubbing
"User [EMAIL:a1b2c3d4e5f6] created credit card [CREDIT_CARD]"

Types of Scrubbed Data

LogStruct can detect and scrub the following types of sensitive data:

  • Email addresses: Replaced with [EMAIL:hash]
  • Passwords in URLs: Replaced with [PASSWORD]
  • Credit card numbers: Replaced with [CREDIT_CARD]
  • Phone numbers: Replaced with [PHONE]
  • Social security numbers: Replaced with [SSN]
  • IP addresses: Replaced with [IP] (disabled by default)
  • MAC addresses: Replaced with [MAC] (disabled by default)
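
As a rough illustration of pattern-based scrubbing, the following sketch applies a few deliberately simplified regular expressions in sequence. These are not the patterns LogStruct or logstop use; the constants, `email_placeholder`, `scrub`, and the SHA-256 hash are assumptions made for the example.

```ruby
require "digest"

# Simplified, illustrative patterns; production scrubbers need stricter rules.
URL_PASSWORD_PATTERN = %r{(//[^\s/:@]+:)[^\s/@]+(@)}
EMAIL_PATTERN = /\b[\w.+\-]+@[\w\-]+(?:\.[\w\-]+)+\b/
CREDIT_CARD_PATTERN = /\b(?:\d{4}[ \-]?){3}\d{4}\b/
SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/

# Hypothetical helper: salted SHA-256, truncated to 12 hex characters.
def email_placeholder(email, salt)
  "[EMAIL:#{Digest::SHA256.hexdigest(salt + email)[0, 12]}]"
end

# URL passwords are scrubbed first so "password@host" is not mistaken for an email.
def scrub(text, salt: "test_salt")
  text
    .gsub(URL_PASSWORD_PATTERN, '\1[PASSWORD]\2')
    .gsub(EMAIL_PATTERN) { |email| email_placeholder(email, salt) }
    .gsub(CREDIT_CARD_PATTERN, "[CREDIT_CARD]")
    .gsub(SSN_PATTERN, "[SSN]")
end

puts scrub("User user@example.com created credit card 4111-1111-1111-1111")
puts scrub("Connecting to database at https://user:password123@db.example.com:5432/mydb")
```

The order of the substitutions matters in this sketch: if the email pattern ran first, the `password123@db.example.com` part of the URL would be replaced as if it were an email address.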

Configuring String Scrubbing

You can enable or disable specific scrubbers as part of the filter configuration:

LogStruct.configure do |config|
  # Configure sensitive data filtering for all strings
  config.filters.email_addresses = true      # Filter email addresses
  config.filters.url_passwords = true        # Filter passwords in URLs
  config.filters.credit_card_numbers = true  # Filter credit card numbers
  config.filters.phone_numbers = true        # Filter phone numbers
  config.filters.ssns = true                 # Filter social security numbers
  config.filters.ip_addresses = false        # Filter IP addresses (off by default)
  config.filters.mac_addresses = false       # Filter MAC addresses (off by default)

  # Configure the salt used for hashing filtered email addresses
  config.filters.hash_salt = ENV.fetch("EMAIL_HASH_SALT", "test_salt")

  # Configure the length of hash output for filtered emails (default: 12)
  config.filters.hash_length = 12
end

Custom String Scrubbing

For data patterns not covered by the built-in scrubbers, you can implement a custom string scrubbing handler:

# Set a custom string scrubbing handler. It is called after the
# built-in scrubbers run. T.let is a Sorbet type assertion.
LogStruct.configure do |config|
  config.string_scrubbing_handler = T.let(
    ->(value) {
      # Custom string scrubbing logic here
      # Example: replace 10- to 12-digit numbers (such as bank account numbers)
      value.gsub(/\b\d{10,12}\b/, "[BANK_ACCOUNT]")
    },
    LogStruct::Handlers::StringScrubber
  )
end
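
Before wiring a handler into the configuration, it can help to exercise the lambda on its own. This standalone sketch uses the same regular expression as the handler above, without the Sorbet wrapper; the constant name `BANK_ACCOUNT_SCRUBBER` and the sample strings are made up for illustration.

```ruby
# Same pattern as the handler above, as a plain lambda.
BANK_ACCOUNT_SCRUBBER = ->(value) { value.gsub(/\b\d{10,12}\b/, "[BANK_ACCOUNT]") }

puts BANK_ACCOUNT_SCRUBBER.call("Refund sent to account 1234567890")
puts BANK_ACCOUNT_SCRUBBER.call("Order 12345 shipped")
puts BANK_ACCOUNT_SCRUBBER.call("Unix timestamp 1700000000")
```

The last call shows a limitation of a purely length-based pattern: any 10- to 12-digit number, including a Unix timestamp, is replaced. A stricter pattern or a contextual prefix may be needed for real data.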

Examples

Here are examples of how LogStruct filters and scrubs sensitive data in different scenarios:

Filtered Hash with Sensitive Keys

# Original hash
{
  user_id: 123,
  email: "user@example.com",
  password: "secret123",
  profile: {
    name: "John Doe",
    phone: "555-123-4567"
  }
}

# Logged output
{
  "user_id": 123,
  "email": {
    "_filtered": {
      "_class": "String",
      "_hash": "a1b2c3d4e5f6"
    }
  },
  "password": {
    "_filtered": {
      "_class": "String"
    }
  },
  "profile": {
    "name": "John Doe",
    "phone": "[PHONE]"
  }
}

Filtered Array with Sensitive Data

# Original array
[
  "Public message",
  "Email: user@example.com",
  "Login: https://admin:secret123@example.com/login",
  { user_id: 123, email: "another@example.com" }
]

# Logged output
[
  "Public message",
  "Email: [EMAIL:a1b2c3d4e5f6]",
  "Login: https://admin:[PASSWORD]@example.com/login",
  {
    "user_id": 123,
    "email": {
      "_filtered": {
        "_class": "String",
        "_hash": "g7h8i9j0k1l2"
      }
    }
  }
]

Sensitive URL in Log Message

# Original message
"Connecting to database at https://user:password123@db.example.com:5432/mydb"

# Logged output
"Connecting to database at https://user:[PASSWORD]@db.example.com:5432/mydb"

Complex Object with Nested Sensitive Data

# Original object
{
  request: {
    method: "POST",
    path: "/api/users",
    params: {
      user: {
        name: "Jane Smith",
        email: "jane@example.com",
        password: "secure_password",
        credit_card: "4111-1111-1111-1111"
      }
    },
    headers: {
      "Authorization": "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
    }
  },
  response: {
    status: 201,
    body: {
      id: 456,
      name: "Jane Smith",
      email: "jane@example.com"
    }
  }
}

# Logged output
{
  "request": {
    "method": "POST",
    "path": "/api/users",
    "params": {
      "user": {
        "name": "Jane Smith",
        "email": {
          "_filtered": {
            "_class": "String",
            "_hash": "m3n4o5p6q7r8"
          }
        },
        "password": {
          "_filtered": {
            "_class": "String"
          }
        },
        "credit_card": {
          "_filtered": {
            "_class": "String"
          }
        }
      }
    },
    "headers": {
      "Authorization": {
        "_filtered": {
          "_class": "String"
        }
      }
    }
  },
  "response": {
    "status": 201,
    "body": {
      "id": 456,
      "name": "Jane Smith",
      "email": "[EMAIL:m3n4o5p6q7r8]"
    }
  }
}
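
The nested example above implies that filtering is applied at every level of the structure. A minimal recursive sketch of that traversal follows. It is an assumption-laden illustration, not LogStruct's code: `NESTED_SENSITIVE_KEYS` and `deep_filter` are hypothetical, and the case-insensitive key comparison is inferred from the "Authorization" header being filtered in the example.

```ruby
# Hypothetical key list, compared case-insensitively so "Authorization" matches.
NESTED_SENSITIVE_KEYS = %w[password credit_card authorization].freeze

# Walks hashes and arrays recursively, replacing values of sensitive keys.
def deep_filter(data)
  case data
  when Hash
    data.to_h do |key, value|
      if NESTED_SENSITIVE_KEYS.include?(key.to_s.downcase)
        [key, { _filtered: { _class: value.class.name } }]
      else
        [key, deep_filter(value)]
      end
    end
  when Array
    data.map { |item| deep_filter(item) }
  else
    data
  end
end

request = {
  method: "POST",
  params: { user: { name: "Jane Smith", password: "secure_password" } },
  headers: { "Authorization" => "Bearer abc" }
}
p deep_filter(request)
```

The sketch returns a new structure and leaves the input untouched, which keeps the original request data usable by the application after logging.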