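My Search Overhaul Diary: A Wonderful Journey from LIKE to Elasticsearch
I. A Basic Search Implementation
Last week I added a search feature to the interview-practice platform I'm building, using the plainest approach available, a MySQL fuzzy query: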

    SELECT * FROM question WHERE title LIKE '%Java虚拟机%';

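It does return results, but there is a problem: searching for "Java虚拟机" (Java virtual machine) only returns questions that contain that exact string. Questions that only mention "虚拟机" (virtual machine) are not matched. If this went live, the user experience would be poor (though probably hardly anyone will use my platform ==).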
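To make the limitation concrete, here is a small plain-Java sketch. It assumes `String.contains` is a fair stand-in for `LIKE '%keyword%'` (it ignores MySQL collation and case rules), and the sample titles are made up for illustration rather than taken from my platform.

```java
import java.util.List;
import java.util.stream.Collectors;

// Toy illustration only: String.contains plays the role of LIKE '%keyword%'.
class LikeDemo {
    // Keep only the titles that contain the keyword as one contiguous substring.
    static List<String> likeSearch(List<String> titles, String keyword) {
        return titles.stream()
                .filter(title -> title.contains(keyword))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample titles.
        List<String> titles = List.of(
                "Java虚拟机内存模型",  // "Java virtual machine memory model"
                "虚拟机类加载机制",     // "virtual machine class loading"
                "Java集合框架");        // "Java collections framework"

        // Only the first title contains the full string "Java虚拟机".
        System.out.println(likeSearch(titles, "Java虚拟机"));
        // The shorter keyword matches the first two titles.
        System.out.println(likeSearch(titles, "虚拟机"));
    }
}
```

The second title is clearly relevant to a "Java虚拟机" search, yet substring matching can never return it, because the query is treated as a single indivisible string.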
II. Stepping Up: Searching with Elasticsearch
Why ES?
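1. **Term-based search**:
"Java虚拟机" is split automatically into terms such as `Java`, `虚拟`, and `机`. The exact tokens depend on the analyzer, and matching `JVM` as well would take a synonym filter on top of tokenization.
2. **Fuzzy matching**:
Searching for "线程池" (thread pool) also finds "多线程池优化" (multi-thread-pool optimization).
3. **Sub-second response**:
In a test with 100,000 records, the average search took **0.2 seconds**.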
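The idea behind term-based search can be sketched with a toy inverted index. This is only an illustration of the concept, not how ES is implemented: the class name `ToyInvertedIndex` is made up, the documents are tokenized by hand, and there is no scoring.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each term to the ids of the documents containing it.
class ToyInvertedIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index a document given its already-tokenized terms.
    void add(int docId, List<String> terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                    .add(docId);
        }
    }

    // Return the ids of documents that contain at least one query term.
    Set<Integer> search(List<String> queryTerms) {
        Set<Integer> hits = new TreeSet<>();
        for (String term : queryTerms) {
            hits.addAll(postings.getOrDefault(term.toLowerCase(Locale.ROOT), Set.of()));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        // Hand-tokenized, hypothetical documents.
        index.add(1, List.of("Java", "虚拟机", "内存", "模型"));
        index.add(2, List.of("虚拟机", "类", "加载"));
        index.add(3, List.of("Java", "集合"));

        // The query is tokenized too, so each term is looked up on its own.
        System.out.println(index.search(List.of("Java", "虚拟机"))); // prints [1, 2, 3]
    }
}
```

Because the query is split into terms before lookup, document 2 is found even though it never contains the full string "Java虚拟机". A real engine would also rank the hits so that document 1, which matches both terms, comes first.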
III. Hands-On Overhaul: A Minimal Setup for a Personal Project
1. Start ES locally

    tar -zxvf elasticsearch-8.12.2-linux-x86_64.tar.gz
    cd elasticsearch-8.12.2
    ./bin/elasticsearch

2. Add the dependency to the Spring Boot project

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    </dependency>

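3. Configure the connection (application.yml)

    spring:
      elasticsearch:
        uris: http://localhost:9200

Note that ES 8 enables security (HTTPS plus authentication) by default, so the plain `http://` URI here assumes security has been switched off for local development, for example with `xpack.security.enabled: false` in `config/elasticsearch.yml`.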
IV. The ES Entity Class and ElasticsearchRepository
1. Create the question entity class
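
    import org.springframework.data.annotation.Id;
    import org.springframework.data.elasticsearch.annotations.Document;
    import org.springframework.data.elasticsearch.annotations.Field;
    import org.springframework.data.elasticsearch.annotations.FieldType;

    @Document(indexName = "my_questions")
    public class Question {
        @Id
        private String id;

        // Analyzed with IK so Chinese text is split into words
        @Field(type = FieldType.Text, analyzer = "ik_smart")
        private String title;

        @Field(type = FieldType.Text, analyzer = "ik_smart")
        private String content;

        // getters and setters omitted
    }
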
2. Extend ElasticsearchRepository. It plays a role similar to MyBatis-Plus's IService, so you don't have to write the simple operations yourself

    public interface QuestionRepository extends ElasticsearchRepository<Question, String> {
        // Query derived from the method name: search by title
        List<Question> findByTitleContaining(String keyword);
    }

    // Usage
    List<Question> results = questionRepository.findByTitleContaining("JVM");

V. Customization
The requirement
ES splits "SpringCloud" into `Spring` and `Cloud`, but I want it recognized as a single term.
Steps:
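1. **Create a dictionary**
In the `config/analysis-ik` directory under the ES installation directory, create `my_dict.dic` containing the custom term. IK dictionaries are UTF-8 text files with one term per line; for this case the term would be `SpringCloud`.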
2. **Modify the IK configuration**
Edit `IKAnalyzer.cfg.xml`:

    <entry key="ext_dict">my_dict.dic</entry>

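3. **Restart the ES service**
Now a search for "SpringCloud" is no longer split apart; it is matched as the custom term instead! Documents indexed before the change keep their old tokens until they are reindexed.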
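Why adding one dictionary entry changes the result can be shown with a toy dictionary-based segmenter. To be clear, this is not IK's actual algorithm; it is a simple forward-maximum-matching sketch, and `ToyDictionarySegmenter` and its dictionaries are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy segmenter: at each position, take the longest word found in the dictionary.
class ToyDictionarySegmenter {
    static List<String> segment(String text, Set<String> dictionary) {
        String input = text.toLowerCase(Locale.ROOT);
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int end = pos + 1; // fallback: emit a single character
            // Try the longest candidate first and shrink until a word matches.
            for (int candidate = input.length(); candidate > pos; candidate--) {
                if (dictionary.contains(input.substring(pos, candidate))) {
                    end = candidate;
                    break;
                }
            }
            tokens.add(input.substring(pos, end));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> builtIn = Set.of("spring", "cloud");
        System.out.println(segment("SpringCloud", builtIn));  // prints [spring, cloud]

        // Simulate adding the custom term to an extension dictionary.
        Set<String> extended = new HashSet<>(builtIn);
        extended.add("springcloud");
        System.out.println(segment("SpringCloud", extended)); // prints [springcloud]
    }
}
```

Once the longer entry exists, it wins over the two shorter words, which is the effect the custom dictionary is meant to have.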
VI. Pitfalls
Pitfall 1: data out of sync
One option is a scheduled job that periodically syncs data from MySQL to ES. Sample code:

    // ScheduleApplication.java
    @SpringBootApplication
    @EnableScheduling // turns on processing of @Scheduled methods
    public class ScheduleApplication {
        public static void main(String[] args) {
            SpringApplication.run(ScheduleApplication.class, args);
        }
    }

    // FullQuestionToEsJob.java
    @Component
    @Slf4j
    public class FullQuestionToEsJob {

        @Resource
        private QuestionEsDao questionEsDao;

        @Resource
        private QuestionService questionService;

        // Runs every 10 seconds. A @Scheduled method must take no arguments,
        // otherwise Spring rejects it at startup.
        @Scheduled(fixedRate = 10000)
        public void copyQuestionToES() {
            // Load every question from MySQL and convert it to its ES form
            List<Question> list = questionService.list();
            List<QuestionEsDTO> questionEsDTOList = list.stream()
                    .map(QuestionEsDTO::objToDto)
                    .collect(Collectors.toList());

            // Write to ES in batches of 500
            int pageSize = 500;
            int total = questionEsDTOList.size();
            log.info("FullCopyToEs start, total {}", total);
            for (int i = 0; i < total; i += pageSize) {
                int end = Math.min(i + pageSize, total);
                log.info("Copy from {} to {}", i, end);
                questionEsDao.saveAll(questionEsDTOList.subList(i, end));
            }
            log.info("FullCopyToEs end, total {}", total);
        }
    }

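The batching loop in the job can be pulled out and checked on its own, without Spring or ES. The sketch below is one way to do that; `BatchDemo.partition` is a hypothetical helper of mine, not part of any library used in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone version of the paging logic used when writing to ES in batches.
class BatchDemo {
    // Split a list into consecutive batches of at most pageSize elements.
    static <T> List<List<T>> partition(List<T> items, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += pageSize) {
            int end = Math.min(i + pageSize, items.size());
            // subList returns a view of the original list, not a copy.
            batches.add(items.subList(i, end));
        }
        return batches;
    }

    public static void main(String[] args) {
        // 1,200 dummy items stand in for the converted questions.
        List<Integer> items = new ArrayList<>();
        for (int i = 0; i < 1200; i++) {
            items.add(i);
        }
        for (List<Integer> batch : partition(items, 500)) {
            System.out.println("batch size: " + batch.size());
        }
    }
}
```

With 1,200 items and a page size of 500, the last batch holds the remaining 200, which is the case the `Math.min` guard exists for.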
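Pitfall 2: Chinese word segmentation not working
Check two things:
1. Make sure the field uses `analyzer = "ik_smart"`.
2. Confirm that the IK plugin is installed in ES (download: elasticsearch-analysis-ik).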