Monitoring Internet performance and measuring user quality of experience are drawing increased attention from both research and industry. To match this interest, large-scale measurement infrastructures have been constructed. We believe that this effort must be combined with a critical review and calibration of the tools being used to measure performance. In this paper, we analyze the suitability of ping for delay measurement. By performing several experiments on different source and destination pairs, we found cases in which ping gave very poor estimates of the delay and jitter that an application might experience. In those cases, delay was heavily dependent on the flow identifier, even if only one IP path was used. For accurate delay measurement, we propose replacing the ping tool with an adaptation of paris-traceroute that supports delay and jitter estimation without being biased by per-flow network load balancing.
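
To illustrate why mixing flows can distort a jitter estimate, the following is a minimal sketch, not the measurement method of this paper. It assumes that each round-trip-time sample is tagged with a flow identifier, that jitter is taken as the population standard deviation of the RTTs (other definitions exist), and that the sample values are hypothetical numbers chosen only for illustration. The names `per_flow_stats` and `mixed_stats` are introduced here and do not come from any tool.

```python
from collections import defaultdict
from statistics import mean, pstdev


def per_flow_stats(samples):
    """Group (flow_id, rtt_ms) samples by flow and return
    {flow_id: (mean_rtt_ms, jitter_ms)} for each flow."""
    by_flow = defaultdict(list)
    for flow_id, rtt_ms in samples:
        by_flow[flow_id].append(rtt_ms)
    return {f: (mean(r), pstdev(r)) for f, r in by_flow.items()}


def mixed_stats(samples):
    """Ignore flow identifiers and return (mean_rtt_ms, jitter_ms)
    over all samples, as a tool that mixes flows would report."""
    rtts = [rtt_ms for _, rtt_ms in samples]
    return mean(rtts), pstdev(rtts)


# Hypothetical RTTs (ms): two flows whose delays differ by about 10 ms,
# with probes alternating between them.
samples = [(0, 20.0), (1, 30.0), (0, 20.5), (1, 30.5), (0, 19.5), (1, 29.5)]

per_flow = per_flow_stats(samples)
mixed_mean, mixed_jitter = mixed_stats(samples)

for flow_id, (m, j) in sorted(per_flow.items()):
    print(f"flow {flow_id}: mean={m:.2f} ms, jitter={j:.2f} ms")
print(f"mixed : mean={mixed_mean:.2f} ms, jitter={mixed_jitter:.2f} ms")
```

In this toy data each flow on its own shows sub-millisecond jitter, while the mixed estimate reports several milliseconds. An application that keeps a single flow identifier would see only the per-flow figure.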