how to trouble shooting http/tcp request

Curl

根据curl分析http请求时间

简单方式可以直接加个通用的time命令:

[root@~]# time curl www.baidu.com
<!DOCTYPE html>
<!--STATUS OK--><html> <head><meta http-equiv=content-type content=text/html;charset=utf-8><meta http-equiv=X-UA-Compatible content=IE=Edge><meta content=always name=referrer><link rel=stylesheet type=text/css href=http://s1.bdstatic.com/r/www/cache/bdorz/baidu.min.css><title>百度一下,你就知道</title></head> <body link=#0000cc> <div id=wrapper> <div id=head> <div class=head_wrapper> <div class=s_form> <div class=s_form_wrapper> <div id=lg> <img hidefocus=true src=//www.baidu.com/img/bd_logo1.png width=270 height=129> </div> <form id=form name=f action=//www.baidu.com/s class=fm> <input type=hidden name=bdorz_come value=1> <input type=hidden name=ie value=utf-8> <input type=hidden name=f value=8> <input type=hidden name=rsv_bp value=1> <input type=hidden name=rsv_idx value=1> <input type=hidden name=tn value=baidu><span class="bg s_ipt_wr"><input id=kw name=wd class=s_ipt value maxlength=255 autocomplete=off autofocus></span><span class="bg s_btn_wr"><input type=submit id=su value=百度一下 class="bg s_btn"></span> </form> </div> </div> <div id=u1> <a href=http://news.baidu.com name=tj_trnews class=mnav>新闻</a> <a href=http://www.hao123.com name=tj_trhao123 class=mnav>hao123</a> <a href=http://map.baidu.com name=tj_trmap class=mnav>地图</a> <a href=http://v.baidu.com name=tj_trvideo class=mnav>视频</a> <a href=http://tieba.baidu.com name=tj_trtieba class=mnav>贴吧</a> <noscript> <a href=http://www.baidu.com/bdorz/login.gif?login&amp;tpl=mn&amp;u=http%3A%2F%2Fwww.baidu.com%2f%3fbdorz_come%3d1 name=tj_login class=lb>登录</a> </noscript> <script>document.write('<a href="http://www.baidu.com/bdorz/login.gif?login&tpl=mn&u='+ encodeURIComponent(window.location.href+ (window.location.search === "" ? "?" : "&")+ "bdorz_come=1")+ '" name="tj_login" class="lb">登录</a>');</script> <a href=//www.baidu.com/more/ name=tj_briicon class=bri style="display: block;">更多产品</a> </div> </div> </div> <div id=ftCon> <div id=ftConw> <p id=lh> <a href=http://home.baidu.com>关于百度</a> <a href=http://ir.baidu.com>About Baidu</a> </p> <p id=cp>&copy;2017&nbsp;Baidu&nbsp;<a href=http://www.baidu.com/duty/>使用百度前必读</a>&nbsp; <a href=http://jianyi.baidu.com/ class=cp-feedback>意见反馈</a>&nbsp;京ICP证030173号&nbsp; <img src=//www.baidu.com/img/gs.gif> </p> </div> </div> </div> </body> </html>

real	0m0.365s
user	0m0.004s
sys	0m0.006s

或者更详细的方式:

1 Create one file such as curl-format.txt

[root@001 jiafu]# cat curl-format.txt 
 time_namelookup:  %{time_namelookup}\n
       time_connect:  %{time_connect}\n
    time_appconnect:  %{time_appconnect}\n
   time_pretransfer:  %{time_pretransfer}\n
      time_redirect:  %{time_redirect}\n
 time_starttransfer:  %{time_starttransfer}\n
                    ----------\n
         time_total:  %{time_total}\n

2 Send Curl command with file with format

[root@001 jiafu]#  curl -w "@curl-format.txt" -o /dev/null -s https://www.baidu.com  
 time_namelookup:  0.001
       time_connect:  0.044
    time_appconnect:  0.295
   time_pretransfer:  0.296
      time_redirect:  0.000
time_starttransfer:  0.345
                    ----------
         time_total:  0.345

3 Analyst the result

time_appconnect The time, in seconds, it took from the start until the SSL/SSH/etc connect/handshake to the remote host was completed. (Added in 7.19.0)

time_connect The time, in seconds, it took from the start until the TCP connect to the remote host (or proxy) was completed.

time_namelookup The time, in seconds, it took from the start until the name resolving was completed.

time_pretransfer The time, in seconds, it took from the start until the file transfer was just about to begin. This includes all pre-transfer commands and negotiations that are specific to the particular protocol(s) involved.

time_redirect The time, in seconds, it took for all redirection steps include name lookup, connect, pretransfer and transfer before the final transaction was started. time_redirect shows the complete execution time for multiple redirections. (Added in 7.12.3)

time_starttransfer The time, in seconds, it took from the start until the first byte was just about to be transferred. This includes time_pretransfer and also the time the server needed to calculate the result.

time_total The total time, in seconds, that the full operation lasted.

tcpDump

根据tcpDump分析http请求时间

如下图:

附上https过程:

traceroute

[sbalabrahman@host ~]$ traceroute www.baidu.com
traceroute to www.baidu.com (145.20.193.63), 30 hops max, 60 byte packets
1  sjc02-wxp00-csw02-vl4085.xxx.com (10.255.29.5)  2.085 ms  2.156 ms  2.289 ms
2  sjc02-wxp00-srt01-vl4000.yyy.com (10.255.3.242)  0.930 ms  1.084 ms  0.878 ms

建议在linux下加-T参数: -T Use TCP SYN for probes
否则经常***,30跳都完成不了。

lsof

根据lsof可以获取tcp连接建立了多久。

1 查询连接端口对用的连接:
[root@test~]# netstat -nap|grep -i 9001
tcp 0 0 10.120.71.15:53323 10.240.168.71:9001 ESTABLISHED 9742/application
2 lsof进程号,更根据本地端口过滤
[root@test~]# lsof -p 9742 | grep 53323
application 9742 root 30u IPv4 175283944 0t0 TCP www.xxx.yyy :etlservicemgr (ESTABLISHED)
3 查询句柄
[root@test~]# ll /proc/9742/fd/30
lrwx------ 1 root nobody 64 Jul 11 06:19 /proc/9742/fd/30 -> socket:[175283944]

BTW: refer to https://unix.stackexchange.com/questions/55924/timestamp-of-socket-in-proc-pid-fd
实际上这个不定是这个connection的创建连接,实际观察也是如此。但是从看到后,变不变还是能反映问题,但是从另外一个角度看,知道建立多久的意义已经不大。
可以直接上来先ls所有fd做个比较。

jvm startup parameters for trouble shooting

总是忘,然后每次都是现查现用,所以汇总记录下吧:

1.enable heapdump:

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdump.hprof -XX:OnOutOfMemoryError =”sh ~/cleanup.sh”

2.enable remote debug:

-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=4000,suspend=n

3.enable GC log:
-Xloggc:/logs/`date +%F_%H-%M-%S`-gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCCause
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M

4.enable JMX:
-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=8091 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.password.file=/conf/jmxremote.password -Dcom.sun.management.jmxremote.access.file=/conf/jmxremote.access

5.enable Jprofiler:
-agentpath:/opt/jprofiler/bin/linux-x64/libjprofilerti.so=port=8849

6.enable EMMA:
-Demma.rt.control=true -Xverify:none -Demma.coverage.out.file=/opt/codecoverage_emma.ec