Big Data 186 - Logstash JDBC vs Syslog Input

1. JDBC Input

1.1 Principle

Connects to MySQL, PostgreSQL, and other relational databases through a JDBC driver and executes SQL queries to fetch rows, each of which becomes one Logstash event.

1.2 Key Config

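input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "root"
    jdbc_password => "password"
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    # Legacy class name; Connector/J 8.x renamed it to "com.mysql.cj.jdbc.Driver"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # Cron-style schedule (every minute); without it the query runs only once
    schedule => "* * * * *"
    # ORDER BY ensures the last row read carries the highest create_time
    statement => "SELECT * FROM orders WHERE create_time > :sql_last_value ORDER BY create_time"
    # Track the column value; when false, sql_last_value is the last run time
    use_column_value => true
    tracking_column => "create_time"
    tracking_column_type => "timestamp"
    last_run_metadata_path => "/path/to/last_run"
  }
}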

1.3 Incremental Sync Key

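Parameter                Description
sql_last_value           Value substituted into the query: the tracking column's value from the last row read when use_column_value is true, otherwise the time of the last run
tracking_column          Column whose value is tracked (timestamp or auto-increment ID)
last_run_metadata_path   File where the last-run state is saved between runs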
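
The incremental sync idea can be sketched outside Logstash. The following Python sketch is only an illustration of the mechanism, not the plugin's implementation: it uses an in-memory SQLite table in place of MySQL, and a JSON file as a stand-in for the file at last_run_metadata_path. The function names (load_last_value, save_last_value, sync_once) and the sample rows are hypothetical.

```python
import json
import os
import sqlite3
import tempfile


def load_last_value(path, default):
    """Return the saved tracking value, or the default on the first run."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)["sql_last_value"]
    return default


def save_last_value(path, value):
    """Persist the tracking value so the next run can resume from it."""
    with open(path, "w") as f:
        json.dump({"sql_last_value": value}, f)


def sync_once(conn, state_path):
    """Fetch rows newer than the saved value, then advance the saved value."""
    last = load_last_value(state_path, "1970-01-01 00:00:00")
    rows = conn.execute(
        "SELECT id, create_time FROM orders "
        "WHERE create_time > ? ORDER BY create_time",
        (last,),
    ).fetchall()
    if rows:
        # Rows are ordered, so the last row holds the highest create_time.
        save_last_value(state_path, rows[-1][1])
    return rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, create_time TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2024-01-01 10:00:00"), (2, "2024-01-01 11:00:00")],
    )
    state = os.path.join(tempfile.mkdtemp(), "last_run")

    print(len(sync_once(conn, state)))  # first run reads both rows: 2
    print(len(sync_once(conn, state)))  # nothing new since the last run: 0
    conn.execute("INSERT INTO orders VALUES (3, '2024-01-01 12:00:00')")
    print(len(sync_once(conn, state)))  # only the new row: 1
```

Because the comparison is a strict "greater than", rows that share the exact timestamp of the last synced row but are written later would be skipped; an auto-increment ID is often a safer tracking column for that reason.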

1.4 Applicable Scenarios

  • Data import from database to Elasticsearch
  • ETL data sync
  • Scheduled batch sync

2. Syslog Input

2.1 Principle

Listens on a syslog port (default 514) over TCP and UDP, and parses incoming messages in the RFC 3164 format as senders push them.

2.2 Config

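input {
  syslog {
    # Ports below 1024 usually need root privileges; use a higher port if Logstash runs unprivileged
    port => 514
    # No codec needed: the plain codec's "format" option only applies when encoding output events
  }
}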
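
To check that a syslog input is receiving data, it can help to push a hand-built test message at it. The sketch below is a minimal example under stated assumptions: it formats an RFC 3164 style line and sends it over UDP. The function names, the hostname "web01", the tag "app", and port 5514 (an unprivileged port you would have to set in the input's port option) are all hypothetical.

```python
import socket
from datetime import datetime


def build_syslog_message(facility, severity, hostname, tag, text, now=None):
    """Format an RFC 3164 style line: <PRI>TIMESTAMP HOST TAG: MESSAGE."""
    # The priority value encodes facility and severity in a single number.
    pri = facility * 8 + severity
    now = now or datetime.now()
    # RFC 3164 timestamps pad the day of month with a space, e.g. "Jan  5".
    timestamp = f"{now:%b} {now.day:2d} {now:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {text}"


def send_udp(message, host="127.0.0.1", port=5514):
    """Send one message as a single UDP datagram (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Facility 1 (user-level) and severity 6 (informational) give PRI 14.
    line = build_syslog_message(1, 6, "web01", "app", "test message")
    print(line)
    send_udp(line)
```

UDP gives no delivery confirmation, so the only way to confirm receipt is to look for the event on the Logstash side, for example with a stdout output.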

2.3 Applicable Scenarios

  • System log collection
  • Network device logs
  • Real-time log stream

3. Comparison

Dimension              JDBC Input             Syslog Input
Data Source            Relational database    Real-time log stream
Sync Method            Scheduled batch        Real-time push
State Management       sql_last_value         Stateless
Applicable Scenarios   ETL/offline sync       Real-time monitoring

4. Summary

  • JDBC: Suitable for database sync, batch processing
  • Syslog: Suitable for real-time log collection
  • Choose the input plugin that matches the data source: pull on a schedule for databases, listen for pushes for log streams