将cdn加速域名开启日志转存,s3新建日志库,注意日志不能直接放在根目录
日志原文
Mozilla/5.0%2520(Windows%2520NT%25206.1;%2520Win64;%2520x64)%2520AppleWebKit/537.36%2520(KHTML,%2520like%2520Gecko)%2520Chrome/70.0.3538.110%2520Safari/537.36 - - Hit 7OEg8GDxuOsuwxzJ4tUgwP8no4foFkBM0QAzfshh4ObilaBcOTbmSA== d1lnephkr7mkjn.cloudfront.net https 106 0.001 - TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/2.0 - -
2018-11-27 06:58:20 TPE52-C1 222 39.8.77.136 GET d1lnephkr7mkjn.cloudfront.net /ueditor/image/20171019/1508385777747154.png 304 http://test.looaon.com/20fbg
建表语句,根据日志格式和日志字段来建表
CREATE EXTERNAL TABLE IF NOT EXISTS elb_logs_orc (
request_timestamp string,
elb_name string,
request_ip string,
request_port int,
backend_ip string,
backend_port int,
request_processing_time double,
backend_processing_time double,
client_response_time double,
elb_response_code string,
backend_response_code string,
received_bytes bigint,
sent_bytes bigint,
request_verb string,
url string,
protocol string,
user_agent string,
ssl_cipher string,
ssl_protocol string )
STORED AS ORC
LOCATION 's3://athena-examples-ap-southeast-1/elb/orc/year=2015/month=01/'
tblproperties ("orc.compress"="ZLIB");
日志格式解释参考
https://docs.aws.amazon.com/zh_cn/AmazonCloudFront/latest/DeveloperGuide/AccessLogs.html#BasicDistributionFileFormat
I was able to find good advice from your blog articles.