Matano

Open Source Alternative to Splunk, Elasticsearch, Panther
Language
Rust
Stars
1552
Watchers
1552
Forks
112
Open Issues
54
Last Updated
4/30/2025

README.md

Open source security data lake for AWS

Matano is an open source, cloud-native security data lake built for security teams on AWS.

[!NOTE] Matano offers a commercial managed Cloud SIEM for a complete enterprise Security Operations platform. Learn more.

Features



  • Security Data Lake: Normalize unstructured security logs into a structured realtime data lake in your AWS account.
  • Collect All Your Logs: Integrates out of the box with 50+ sources for security logs and can easily be extended with custom sources.
  • Detection-as-Code: Use Python to build realtime detections as code, with support for automatically importing Sigma detections into Matano.
  • Log Transformation Pipeline: Supports custom VRL (Vector Remap Language) scripting to parse, enrich, normalize and transform your logs as they are ingested without managing any servers.
  • No Vendor Lock-In: Uses an open table format (Apache Iceberg) and open schema standards (ECS), to give you full ownership of your security data in a vendor-neutral format.
  • Bring Your Own Analytics: Query your security lake directly from any Iceberg-compatible engine (AWS Athena, Snowflake, Spark, Trino etc.) without having to copy data around.
  • Serverless: Fully serverless and designed specifically for AWS, with a focus on high scale, low cost, and zero-ops.

Architecture


👀 Use cases

  • Reduce SIEM costs.
  • Augment your SIEM with a security data lake for additional context during investigations.
  • Write detections-as-code using Python to detect suspicious behavior & create contextualized alerts.
  • ECS-compatible serverless alternative to ELK / Elastic Security stack.

✨ Integrations

Managed log sources

Alert destinations

Query engines

Quick start

View the complete installation instructions

Installation

Install the matano CLI to deploy Matano into your AWS account, and manage your deployment.

Linux

curl -OL https://github.com/matanolabs/matano/releases/download/nightly/matano-linux-x64.sh
chmod +x matano-linux-x64.sh
sudo ./matano-linux-x64.sh

macOS

curl -OL https://github.com/matanolabs/matano/releases/download/nightly/matano-macos-x64.sh
chmod +x matano-macos-x64.sh
sudo ./matano-macos-x64.sh

Deployment

Read the complete docs on getting started

To get started, run the matano init command.

  • Make sure you have AWS credentials in your environment (or in an AWS CLI profile).
  • The interactive CLI wizard will walk you through getting started by generating an initial Matano directory for you, initializing your AWS account, and deploying into your AWS account.
  • Initial deployment takes a few minutes.

Directory structure

Once initialized, your Matano directory is used to control & manage all resources in your project e.g. log sources, detections, and other configuration. It is structured as follows:

➜  example-matano-dir git:(main) tree
├── detections
│   └── aws_root_credentials
│       ├── detect.py
│       └── detection.yml
├── log_sources
│   ├── cloudtrail
│   │   ├── log_source.yml
│   │   └── tables
│   │       └── default.yml
│   └── zeek
│       ├── log_source.yml
│       └── tables
│           └── dns.yml
├── matano.config.yml
└── matano.context.json

When onboarding a new log source or authoring a detection, run matano deploy from anywhere in your project to deploy the changes to your account.
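
As an illustration of the layout above, the following sketch scaffolds a new detection directory. It is not part of the matano CLI: `scaffold_detection` is a hypothetical helper, and the stub file contents are placeholders modeled on the examples in this README rather than a complete configuration.

```python
import tempfile
from pathlib import Path

# Placeholder detection body; replace with real logic before deploying.
DETECT_PY = '''def detect(record):
    # Placeholder: return True to flag the record.
    return False
'''

# Minimal config using only the `tables` key shown elsewhere in this README.
DETECTION_YML = '''---
tables:
  - {table}
'''


def scaffold_detection(project_dir, name, table):
    """Create detections/<name>/ with stub detect.py and detection.yml files.

    Hypothetical helper for illustration; not part of the matano CLI.
    """
    detection_dir = Path(project_dir) / "detections" / name
    detection_dir.mkdir(parents=True, exist_ok=True)
    (detection_dir / "detect.py").write_text(DETECT_PY)
    (detection_dir / "detection.yml").write_text(DETECTION_YML.format(table=table))
    return detection_dir


# Demonstrate against a throwaway directory rather than a real project.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_detection(tmp, "failed_logins", "aws_cloudtrail")
    scaffolded_files = sorted(p.name for p in created.iterdir())
    print(scaffolded_files)
```

After creating the files in a real Matano directory, matano deploy would still be needed to push the change to your account.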

🔧 Log Transformation & Data Normalization

Read the complete docs on configuring custom log sources

Vector Remap Language (VRL) allows you to easily onboard custom log sources and encourages you to normalize fields according to the Elastic Common Schema (ECS), enabling enhanced pivoting and bulk search for IOCs across your security data lake.

Users can define custom VRL programs to parse and transform unstructured logs as they are being ingested through one of the supported mechanisms for a log source (e.g. S3, SQS).

VRL is an expression-oriented language designed for transforming observability data (e.g. logs) in a safe and performant manner. It features a simple syntax and a rich set of built-in functions tailored specifically to observability use cases.

Example: parsing JSON

Let's have a look at a simple example. Imagine that you're working with HTTP log events that look like this:

{
  "line": "{\"status\":200,\"srcIpAddress\":\"1.1.1.1\",\"message\":\"SUCCESS\",\"username\":\"ub40fan4life\"}"
}

You want to apply these changes to each event:

  • Parse the raw line string into JSON, and explode the fields to the top level
  • Rename srcIpAddress to the source.ip ECS field
  • Remove the username field
  • Convert the message to lowercase

Adding this VRL program to your log source as a transform step would accomplish all of that:

log_source.yml
transform: |
  . = object!(parse_json!(string!(.json.line)))
  .source.ip = del(.srcIpAddress)
  del(.username)
  .message = downcase(string!(.message))

schema:
  ecs_field_names:
    - source.ip
    - http.status

The resulting event 🎉:

{
  "message": "success",
  "status": 200,
  "source": {
    "ip": "1.1.1.1"
  }
}
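
For readers less familiar with VRL, the same four steps can be mirrored in plain Python. This is only an illustration of what the transform does to the sample event: Matano runs the VRL program, not this code, and the `transform` function below is a hypothetical name that takes the sample event exactly as shown above (with a top-level `line` key).

```python
import json


def transform(event):
    """Plain-Python equivalent of the four VRL steps (illustration only)."""
    # 1. Parse the raw line into JSON and promote its fields to the top level.
    out = json.loads(event["line"])
    # 2. Move srcIpAddress to the nested ECS field source.ip.
    out.setdefault("source", {})["ip"] = out.pop("srcIpAddress")
    # 3. Drop the username field.
    out.pop("username", None)
    # 4. Lowercase the message.
    out["message"] = out["message"].lower()
    return out


raw_event = {
    "line": '{"status":200,"srcIpAddress":"1.1.1.1","message":"SUCCESS","username":"ub40fan4life"}'
}
result = transform(raw_event)
print(json.dumps(result, sort_keys=True))
```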

📝 Writing Detections

Read the complete docs on detections

Use detections to define rules that can alert on threats in your security logs. A detection is a Python program that is invoked with data from a log source in realtime and can create an alert.

Examples

Detect failed attempts to export an AWS EC2 instance in AWS CloudTrail logs.

def detect(record):
  return (
    record.deepget("event.action") == "CreateInstanceExportTask"
    and record.deepget("event.provider") == "ec2.amazonaws.com"
    and record.deepget("event.outcome") == "failure"
  )
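
A detection like this can be exercised locally before deploying. The sketch below uses `FakeRecord`, a hypothetical stand-in for the record object Matano passes to `detect`. It assumes `deepget` takes a dotted path and an optional default, which is how it is called in this README; the real object may behave differently.

```python
class FakeRecord(dict):
    """Hypothetical stand-in for the record object passed to detect()."""

    def deepget(self, path, default=None):
        # Walk a dotted path such as "event.action" through nested dicts.
        node = self
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node


def detect(record):
    return (
        record.deepget("event.action") == "CreateInstanceExportTask"
        and record.deepget("event.provider") == "ec2.amazonaws.com"
        and record.deepget("event.outcome") == "failure"
    )


failed_export = FakeRecord(
    event={
        "action": "CreateInstanceExportTask",
        "provider": "ec2.amazonaws.com",
        "outcome": "failure",
    }
)
# Same event, but the export succeeded, so the detection should not fire.
successful_export = FakeRecord(event={**failed_export["event"], "outcome": "success"})

print(detect(failed_export), detect(successful_export))
```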

Detect Brute Force Logins by IP across all configured log sources (e.g. Okta, AWS, GWorkspace)

detect.py
def detect(r):
    return (
        "authentication" in r.deepget("event.category", [])
        and r.deepget("event.outcome") == "failure"
    )


def title(r):
    return f"Multiple failed logins from {r.deepget('user.full_name')} - {r.deepget('source.ip')}"


def dedupe(r):
    return r.deepget("source.ip")

detection.yml
---
tables:
  - aws_cloudtrail
  - okta_system
  - o365_audit
alert:
  severity: medium
  threshold: 5
  deduplication_window_minutes: 15
  destinations:
    - slack_my_team
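
To build intuition for how `threshold` and `deduplication_window_minutes` interact with the `dedupe` key, here is a toy model. It rests on assumptions this README does not confirm: that matches are counted per dedupe key, that the window slides, and that an alert activates once the count reaches the threshold (the comparison could equally be strict). `MatchCounter` is a hypothetical class, not Matano's implementation.

```python
from collections import defaultdict, deque

THRESHOLD = 5             # mirrors alert.threshold
WINDOW_SECONDS = 15 * 60  # mirrors alert.deduplication_window_minutes


class MatchCounter:
    """Toy model: count rule matches per dedupe key inside a sliding window."""

    def __init__(self, threshold=THRESHOLD, window_seconds=WINDOW_SECONDS):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._matches = defaultdict(deque)  # dedupe key -> match timestamps

    def record_match(self, dedupe_key, timestamp):
        """Register a match; report whether the key is at or above the threshold."""
        window = self._matches[dedupe_key]
        window.append(timestamp)
        # Discard matches that have fallen out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold


counter = MatchCounter()
# Five failed logins from one IP, one minute apart (timestamps in seconds).
activations = [counter.record_match("203.0.113.7", t * 60) for t in range(5)]
print(activations)
```

Under these assumptions only the fifth match from the same source IP activates the alert, and matches from other IPs are counted separately.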

Detect Successful Login from never before seen IP for User

from detection import remotecache

# a cache of user -> ip[]
user_to_ips = remotecache("user_ip")

def detect(record):
    if (
      record.deepget("event.action") == "ConsoleLogin" and
      record.deepget("event.outcome") == "success"
    ):
        # A unique key on the user name
        user = record.deepget("user.name")

        existing_ips = user_to_ips[user] or []
        updated_ips = user_to_ips.add_to_string_set(
          user,
          record.deepget("source.ip")
        )

        # Alert on new IPs
        new_ips = set(updated_ips) - set(existing_ips)
        if existing_ips and new_ips:
            return True
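
The logic above can be checked locally with in-memory stand-ins. Both `FakeRemoteCache` and `FakeRecord` below are hypothetical. The cache assumes that a missing key yields `None` and that `add_to_string_set` returns the updated values, which is how the detection uses them; the real `remotecache` may differ. The `detect` function is restated so that it returns an explicit `False` when no alert is raised.

```python
class FakeRemoteCache:
    """In-memory stand-in for remotecache (hypothetical; local testing only)."""

    def __init__(self):
        self._store = {}

    def __getitem__(self, key):
        # Assumed behaviour: a missing key yields None rather than raising.
        values = self._store.get(key)
        return sorted(values) if values is not None else None

    def add_to_string_set(self, key, value):
        # Assumed behaviour: add the value and return the updated set as a list.
        self._store.setdefault(key, set()).add(value)
        return sorted(self._store[key])


class FakeRecord(dict):
    """Hypothetical record with a dotted-path deepget."""

    def deepget(self, path, default=None):
        node = self
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node


user_to_ips = FakeRemoteCache()


def detect(record):
    if (
        record.deepget("event.action") == "ConsoleLogin"
        and record.deepget("event.outcome") == "success"
    ):
        user = record.deepget("user.name")
        existing_ips = user_to_ips[user] or []
        updated_ips = user_to_ips.add_to_string_set(user, record.deepget("source.ip"))
        # Alert only when the user already has a baseline and this IP is new.
        new_ips = set(updated_ips) - set(existing_ips)
        return bool(existing_ips and new_ips)
    return False


def login(user, ip):
    return FakeRecord(
        event={"action": "ConsoleLogin", "outcome": "success"},
        user={"name": user},
        source={"ip": ip},
    )


results = [
    detect(login("alice", "198.51.100.1")),  # first sighting: builds the baseline
    detect(login("alice", "198.51.100.1")),  # known IP
    detect(login("alice", "203.0.113.9")),   # new IP for a known user
]
print(results)
```

Note that the first login for any user never alerts, because there is no baseline to compare against yet.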

🚨 Alerting

Read the complete docs on alerting

Alerts table

All alerts are automatically stored in a Matano table named matano_alerts. The alerts and rule matches are normalized to ECS and contain context about the original event that triggered the rule match, along with the alert and rule data.

Example Queries

Summarize alerts in the last week that are activated (exceeded the threshold)

select
  matano.alert.id as alert_id,
  matano.alert.rule.name as rule_name,
  max(matano.alert.title) as title,
  count(*) as match_count,
  min(matano.alert.first_matched_at) as first_matched_at,
  max(ts) as last_matched_at,
  array_distinct(flatten(array_agg(related.ip))) as related_ip,
  array_distinct(flatten(array_agg(related.user))) as related_user,
  array_distinct(flatten(array_agg(related.hosts))) as related_hosts,
  array_distinct(flatten(array_agg(related.hash))) as related_hash
from
  matano_alerts
where
  matano.alert.first_matched_at > (current_timestamp - interval '7' day)
  and matano.alert.activated = true
group by
  matano.alert.rule.name,
  matano.alert.id
order by
  last_matched_at desc

Delivering alerts

You can deliver alerts to external systems by using the alerting SNS topic, which can forward them to email, Slack, and other services.



A medium severity alert delivered to Slack
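
One way to consume the SNS topic yourself is an AWS Lambda function subscribed to it. The sketch below shows only the generic SNS-to-Lambda plumbing, where each record carries the published message as a string under `Sns.Message`. The alert payload schema is not described in this README, so the `severity` and `title` keys are placeholder assumptions and `summarize_alert` is a hypothetical helper.

```python
import json


def summarize_alert(alert):
    """Build a one-line summary. The field names used here are assumptions."""
    severity = alert.get("severity", "unknown")
    title = alert.get("title", "untitled alert")
    return f"[{severity}] {title}"


def handler(event, context=None):
    """AWS Lambda entry point for an SNS-triggered function."""
    summaries = []
    for sns_record in event.get("Records", []):
        # SNS delivers the published message as a string under Sns.Message.
        body = sns_record["Sns"]["Message"]
        try:
            alert = json.loads(body)
        except json.JSONDecodeError:
            alert = None
        # Fall back to the raw text when the message is not a JSON object.
        if not isinstance(alert, dict):
            alert = {"title": body}
        summaries.append(summarize_alert(alert))
    return summaries


sample_event = {
    "Records": [
        {
            "Sns": {
                "Message": json.dumps(
                    {"severity": "medium", "title": "Multiple failed logins"}
                )
            }
        }
    ]
}
print(handler(sample_event))
```

From here the summaries could be posted to a chat webhook or ticketing system of your choice.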

❤️ Community support

For general help on usage, please refer to the official documentation. For additional help, feel free to use one of these channels to ask a question:

  • Discord (Come join the family, and hang out with the team and community)
  • Forum (For deeper conversations about features, the project, or problems)
  • GitHub (Bug reports, Contributions)
  • Twitter (Get news hot off the press)

👷 Contributors

Thanks go to these wonderful people (emoji key):

  • Shaeq Ahmed 🚧
  • Samrose 🚧
  • Kai Herrera 💻 🤔 🚇
  • Ram 🐛 🤔 📓
  • Zach Mowrey 🤔 🐛 📓
  • marcin-kwasnicki 📓 🐛 🤔
  • Greg Rapp 🐛 🤔
  • Matthew X. Economou 🐛
  • Jarret Raim 🐛
  • Matt Franz 🐛
  • Francesco Faenzi 🤔
  • Nishant Das Patnaik 🤔
  • Tim O'Guin 🤔 🐛 💻
  • Francesco R. 🐛
  • Joshua Sorenson 💻 📖
  • Chris Smith 💻

This project follows the all-contributors specification. Contributions of any kind are welcome!

License

Categories:
Cybersecurity