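When you use the logstash-input-jdbc plugin to sync MySQL data into Elasticsearch, Logstash applies a default dynamic mapping template named logstash. During Logstash startup you will see messages like the following: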

Using mapping template from {:path=>nil}

Attempting to install template {:manage_template=>{"template"=>"logstash-*","version"=>50001,"settings"=>{"index.refresh_interval"=>"5s"},"mappings"=>{"_default_"=>{"_all"=>{"enabled"=>true,"norms"=>false},"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message","match_mapping_type"=>"string", "mapping"=>{"type"=>"text","norms"=>false}}},{"string_fields"=>{"match"=>"*","match_mapping_type"=>"string","mapping"=>{"type"=>"text","norms"=>false,"fields"=>{"keyword"=>{"type"=>"keyword"}}}}}],"properties"=>{"@timestamp"=>{"type"=>"date","include_in_all"=>false},"@version"=>{"type"=>"keyword","include_in_all"=>false},"geoip"=>{"dynamic"=>true,"properties"=>{"ip"=>{"type"=>"ip"},"location"=>{"type"=>"geo_point"},"latitude"=>{"type"=>"half_float"},"longitude"=>{"type"=>"half_float"}}}}}}}}

Installing elasticsearch template to _template/logstash

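The path=>nil in the first line means that no custom template was found, so Logstash falls back to its default template and finally stores it in Elasticsearch's template path under the name logstash. The template contents: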


{
  "template": "logstash-*",
  "version": 50001,
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "_all": {
        "enabled": true,
        "norms": false
      },
      "dynamic_templates": [
        {
          "message_field": {
            "path_match": "message",
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "norms": false
            }
          }
        },
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "norms": false,
              "fields": {
                "keyword": {
                  "type": "keyword"
                }
              }
            }
          }
        }
      ],
      "properties": {
        "@timestamp": {
          "type": "date",
          "include_in_all": false
        },
        "@version": {
          "type": "keyword",
          "include_in_all": false
        },
        "geoip": {
          "dynamic": true,
          "properties": {
            "ip": {
              "type": "ip"
            },
            "location": {
              "type": "geo_point"
            },
            "latitude": {
              "type": "half_float"
            },
            "longitude": {
              "type": "half_float"
            }
          }
        }
      }
    }
  }
}
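To make the effect of the two dynamic_templates rules more concrete, here is a rough Python sketch of how a newly seen field might be matched against them. It is a simplified model and not Elasticsearch's actual implementation: the helper names (detected_type, resolve_mapping) are made up for illustration, wildcard matching is approximated with fnmatch, and details such as date detection are ignored.

```python
from fnmatch import fnmatchcase

# The two rules copied from the default template shown above.
DYNAMIC_TEMPLATES = [
    {"message_field": {"path_match": "message", "match_mapping_type": "string",
                       "mapping": {"type": "text", "norms": False}}},
    {"string_fields": {"match": "*", "match_mapping_type": "string",
                       "mapping": {"type": "text", "norms": False,
                                   "fields": {"keyword": {"type": "keyword"}}}}},
]


def detected_type(value):
    """Very rough stand-in for the JSON type detection (assumption)."""
    if isinstance(value, bool):  # bool first: it is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return "object"


def resolve_mapping(path, value, templates=DYNAMIC_TEMPLATES):
    """Return (rule_name, mapping) of the first rule that matches, else (None, None)."""
    name = path.rsplit(".", 1)[-1]  # "match" tests the field name, "path_match" the full path
    json_type = detected_type(value)
    for entry in templates:
        for rule_name, rule in entry.items():
            if rule.get("match_mapping_type", json_type) != json_type:
                continue
            if "match" in rule and not fnmatchcase(name, rule["match"]):
                continue
            if "path_match" in rule and not fnmatchcase(path, rule["path_match"]):
                continue
            return rule_name, rule["mapping"]
    return None, None


print(resolve_mapping("message", "hello")[0])      # message_field
print(resolve_mapping("code", "HTZG5jjhffdwe")[0])  # string_fields
print(resolve_mapping("id", 42))                    # (None, None)
```

In this model, every string field other than message falls through to string_fields, which means an analyzed text field plus a not-analyzed keyword sub-field.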

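This template automatically maps the fields that are synced over, but it has one drawback: most text fields are analyzed (tokenized), whereas I mostly need them not analyzed, so I had to define a custom mapping. At first I was not aware of template precedence. I left the template configuration untouched, with everything at its defaults, and before starting Logstash I used curl -XPUT to create a not-analyzed mapping on the ES cluster. After the sync finished, however, that mapping had not taken effect. That is when I realized that the Logstash output plugin takes precedence over the mapping you create on the cluster yourself. So the next step is to modify the template and overwrite the default one.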

First, delete the default template:

curl -XDELETE -u elastic '192.168.110.31:8011/_template/logstash'

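Then create a new file named es-template.json (the name is arbitrary), paste in the contents of the default template, and modify them. Here I set every text field to not analyzed. The template contents are as follows: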

{
  "template": "my_index",
  "settings": {
    "index.refresh_interval": "5s"
  },
  "mappings": {
    "_default_": {
      "_all": { "enabled": false, "omit_norms": true },
      "dynamic_templates": [
        {
          "message_field": {
            "match": "message",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed",
              "omit_norms": true,
              "fielddata": { "format": "disabled" }
            }
          }
        },
        {
          "string_fields": {
            "match": "*",
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "index": "not_analyzed",
              "omit_norms": true,
              "fielddata": { "format": "disabled" },
              "fields": {
                "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 }
              }
            }
          }
        }
      ]
    }
  }
}
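Because a hand-edited template file is easy to break, it may help to sanity-check it before starting Logstash. The following Python sketch parses a template and lists any dynamic template rules that would still be analyzed. The function name analyzed_rules and the trimmed inline copy of the template are illustrative assumptions; in practice you would load your own es-template.json instead.

```python
import json

# A trimmed copy of the custom template above, inlined so the sketch is
# self-contained. In practice, read es-template.json from disk instead.
TEMPLATE_TEXT = """
{
  "template": "my_index",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {"message_field": {"match": "message", "match_mapping_type": "string",
          "mapping": {"type": "string", "index": "not_analyzed"}}},
        {"string_fields": {"match": "*", "match_mapping_type": "string",
          "mapping": {"type": "string", "index": "not_analyzed"}}}
      ]
    }
  }
}
"""


def analyzed_rules(template):
    """Return the names of dynamic template rules whose mapping would still be analyzed."""
    offenders = []
    for type_mapping in template.get("mappings", {}).values():
        for entry in type_mapping.get("dynamic_templates", []):
            for rule_name, rule in entry.items():
                mapping = rule.get("mapping", {})
                is_text = mapping.get("type") == "text"
                is_analyzed_string = (mapping.get("type") == "string"
                                      and mapping.get("index") != "not_analyzed")
                if is_text or is_analyzed_string:
                    offenders.append(rule_name)
    return offenders


# json.loads raises json.JSONDecodeError if the file content is not valid JSON.
template = json.loads(TEMPLATE_TEXT)
print(analyzed_rules(template))  # prints []
```

An empty list means every string rule in the template is marked not_analyzed; any name in the list points at a rule that would still tokenize its values.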


Next comes the output section of jdbc.conf, the Logstash startup configuration file:

if [type] == "my_type" {
  elasticsearch {
    hosts => ["192.168.110.31:8011", "192.168.110.31:8012", "192.168.110.31:8013"]
    user => "elastic"
    password => "abc123qwer"
    index => "my_index"
    document_id => "%{id}"
    # manage_template => "false"
    template => "/home/lvyuan/elasticsearch/logstash-5.5.3/template/es-template.json"
    template_name => "my_index"
    template_overwrite => "true"
  }
}


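Before starting Logstash, delete the previously created index and template. If the change has not taken effect after startup, be sure to delete the index and the template first (meaning the template stored under _template in Elasticsearch, not the physical template file), then adjust the template and run again.

curl -XDELETE -u elastic 'http://192.168.110.31:8011/_template/my_index'

curl -XDELETE -u elastic 'http://192.168.110.31:8011/my_index'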
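If you prefer to script this cleanup, the two curl calls can be sketched in Python with the standard library as follows. This is only a sketch: the helper names build_delete and cleanup_requests are made up here, the password is taken from the Logstash config above, and the requests are built but not sent, so nothing is deleted until you call urlopen yourself.

```python
import base64
import urllib.request


def build_delete(base_url, path, user, password):
    """Build (but do not send) an HTTP DELETE request with basic auth."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url, method="DELETE")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", "Basic " + token)
    return request


def cleanup_requests(base_url, name, user, password):
    """Requests mirroring the two curl calls: the stored template, then the index."""
    return [
        build_delete(base_url, "_template/" + name, user, password),
        build_delete(base_url, name, user, password),
    ]


pending = cleanup_requests("http://192.168.110.31:8011", "my_index",
                           "elastic", "abc123qwer")
for request in pending:
    print(request.get_method(), request.full_url)
    # To actually send it (this deletes data!):
    # urllib.request.urlopen(request)
```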


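My original goal was to have Elasticsearch replace SQL queries against MySQL, not to do full-text search, so analysis can actually get in the way. For example, one field in MySQL stores a sequence of mixed upper-case and lower-case letters (e.g. HTZG5jjhffdwe). With the default mapping template (which analyzes the field), the whole sequence is lowercased before it is stored as a token, so a termQuery (not analyzed, exact match) with the original value finds nothing. Some will say that matchPhraseQuery can be used instead, and that does find the document. But a prefixQuery (also not analyzed) runs into the same problem: the lowercase prefix "htzg5" matches, while the uppercase prefix does not, because every token in the index is lowercase. So it has to be decided case by case; not every field should be analyzed. You can try it yourself: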

http://localhost:8011/_analyze?pretty&analyzer=standard&text=HTZG5jjhffdEX7w52r37880 (the whole value is lowercased before being stored as a token)

{"tokens":[{"token":"htzg5jjhffdex7w52r37880","start_offset":0,"end_offset":22,"type":"<ALPHANUM>","position":0}]}
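The term and prefix behaviour described above can be reproduced without a cluster using a small Python model. It is only an approximation under stated assumptions: standard_like_tokens mimics the standard analyzer just for plain ASCII alphanumeric input (split, then lowercase), and term_query and prefix_query are illustrative stand-ins that compare the query string against the stored tokens without analyzing it.

```python
import re


def standard_like_tokens(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def keyword_tokens(text):
    """A not-analyzed field stores the value unchanged as a single term."""
    return [text]


def term_query(tokens, term):
    """Exact match against stored terms; the query text is not analyzed."""
    return term in tokens


def prefix_query(tokens, prefix):
    """Prefix match against stored terms; the query text is not analyzed."""
    return any(token.startswith(prefix) for token in tokens)


value = "HTZG5jjhffdwe"
analyzed = standard_like_tokens(value)  # ['htzg5jjhffdwe']
exact = keyword_tokens(value)           # ['HTZG5jjhffdwe']

print(term_query(analyzed, value))       # False: only the lowercased token is stored
print(prefix_query(analyzed, "htzg5"))   # True
print(prefix_query(analyzed, "HTZG5"))   # False
print(term_query(exact, value))          # True
print(prefix_query(exact, "HTZG5"))      # True
```

With the not-analyzed variant, the stored term keeps its original case, so exact and prefix lookups behave like string comparisons in SQL.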


References: http://blog.csdn.net/asia_kobe/article/details/51192848

http://www.cnblogs.com/NextNight/p/6860283.html

http://www.cnblogs.com/cocowool/p/elk_dynamic_templates.html

https://elasticsearch.cn/article/21

http://m.blog.csdn.net/u012516166/article/details/75106184