VISUALIZATION ANALYSIS OF TRAFFIC CONGESTION BASED ON FLOATING CAR DATA

Abstract. Traffic congestion visualization is an important part of traffic information services. However, real-time data is difficult to obtain and existing analysis methods are not accurate, so the reliability of congestion state visualization is low. This paper proposes a visualization analysis algorithm for traffic congestion based on Floating Car Data (FCD), which uses the FCD to estimate and display the dynamic traffic state on an electronic map. Firstly, an improved map matching method, consisting of a coarse matching step and a precise matching step, is put forward to match the FCD rapidly with road sections. Then, the traffic speed is estimated and classified to display different traffic states. Finally, multiple groups of experiments are conducted based on more than 8000 taxis in Xi'an. The experimental results show that the FCD can be matched with the selected road sections with an accuracy of up to 96%, and that the estimated real-time traffic state reaches 94% in terms of reliability. The proposed visualization analysis algorithm can therefore display the road traffic state accurately in real time.


1. Introduction. Traffic congestion on urban roads is a serious problem facing most cities, and it has a great influence on people's lives, property and air quality. A convenient and accurate method for displaying real-time traffic information is therefore necessary for efficient traffic.
To monitor traffic conditions or collect traffic volume data, several traditional methods such as remote sensing are widely used. The existing methods for collecting road traffic flow mainly adopt fixed-point modes, including induction loop, microwave, radar and video techniques, which have drawbacks such as limited coverage, high cost and difficult maintenance [4]. FCD (Floating Car Data) has the advantages of easy maintenance, low cost and comprehensive information, so it has received significant attention in the past decades [10]. Guo used FCD to obtain the real-time traffic flow [6]. Yuan designed a scheme for traffic congestion detection and dissemination based on FCD [16]. However, these studies usually focused on the estimation of link travel time and link travel speed, and seldom paid attention to the visualization of traffic congestion.
Map matching algorithms integrate positioning data with spatial road network data to identify the correct link on which a vehicle is moving and to determine the location of the vehicle on that link. Inaccurate positioning data and low-frequency positioning data are the two main challenges in map-matching research. Bierlaire developed a post-processing map-matching algorithm for GPS-enabled smartphones, intended only for driving [1]. Position, speed and heading data from GPS are used to assign a probability to each candidate path, based on the horizontal accuracy of the GPS data and the road network. Miwa [11] and Lou [9] developed similar algorithms specifically for vehicles with lower-frequency positioning data that contains only position coordinates and a timestamp. Sparse positioning data, with gaps of up to 1.5, 3 and 5 min and containing latitude, longitude and timestamp, were used. Chen proposed an FCD map-matching algorithm based on local path searching [4], in which the information of the previously matched GPS point is used to reduce the search space significantly. Chen et al. proposed a map-matching algorithm for large-scale low-frequency floating car data [3]. They used a multi-criteria dynamic programming technique to minimize the number of candidate routes for each GPS point. Lou et al. [9] considered two important points in their post-processing algorithm: (a) true paths tend to be straight rather than roundabout, and (b) true paths tend to follow the posted speed limits on roads. Yang et al. [15] proposed projecting GPS points within 20 m of an intersection node onto the node itself, postponing resolution of the problem to the next point. Quddus [12] developed an algorithm that uses the outputs of an extended Kalman filter formulation, integrating GPS, dead reckoning data and a spatial digital database of the road network, to provide continuous, accurate and reliable vehicle location on a given road segment.
This paper proposes a dynamic traffic visualization algorithm based on FCD. The proposed algorithm is composed of map matching, velocity estimation and data visualization. It is fast and highly reliable, and is therefore well suited to people's daily travel needs. The remainder of this paper is organized as follows: the improved map matching algorithm, based on a weighted topology and trajectory shape, is introduced in the next section; Section 3 puts forward a compensation method for road speed estimation; Section 4 describes a method for visualizing traffic congestion data; experiments using real data and their analysis are presented in Section 5. Finally, the conclusions are summarized in the last section.
2.1. Data pre-processing. In this paper, the FCD is acquired from the GPS units of 8000 Xi'an taxis. The data sampling interval is 30 seconds. The original format of the received data includes six kinds of information: plate number, time, longitude, latitude, speed and direction, as shown in Table 1.
FCD cannot be used directly in an actual ITS, for the following reasons:
GPS coordinate errors: an error of 3-15 m is prevalent in most commercial GPS receivers, which causes a deviation between the GPS coordinates of a vehicle and its position on the road.
Insufficient attribute information: GPS data contains a latitude and a longitude, but dynamic navigation requires them to be associated with a specific network number.
Because of equipment quality and transmission problems, the original data contains errors. Applying the map matching method to it directly greatly reduces the efficiency of the algorithm, so this study cleans the original data by preprocessing. The main idea is to eliminate records whose speed is less than 0 km/h or greater than 120 km/h, whose latitude or longitude is out of range, or whose driving direction is abnormal.
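The cleaning rules above can be sketched as a simple record filter. This is a minimal illustration, not the authors' implementation: the `FcdRecord` type, the bounding box used for "in range" coordinates, and the reading of an "abnormal" direction as a heading outside [0, 360) degrees are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FcdRecord:
    """One FCD sample with the six fields listed in the text (hypothetical type)."""
    plate: str
    time: str
    lon: float
    lat: float
    speed_kmh: float
    heading_deg: float


# Assumed bounding box roughly covering the study area; not taken from the paper.
LON_RANGE = (108.6, 109.3)
LAT_RANGE = (34.0, 34.5)


def is_valid(record: FcdRecord) -> bool:
    """Keep a record only if speed, position and heading are all plausible."""
    speed_ok = 0.0 <= record.speed_kmh <= 120.0
    lon_ok = LON_RANGE[0] <= record.lon <= LON_RANGE[1]
    lat_ok = LAT_RANGE[0] <= record.lat <= LAT_RANGE[1]
    # Assumption: an "abnormal" direction is one outside [0, 360) degrees.
    heading_ok = 0.0 <= record.heading_deg < 360.0
    return speed_ok and lon_ok and lat_ok and heading_ok


def clean(records):
    """Return only the records that pass every check."""
    return [r for r in records if is_valid(r)]
```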

2.2.1. Coarse matching. There are four weights in this map matching algorithm: heading, proximity, link connectivity and turn restriction [14]. The heading weight is obtained from the vehicle movement direction and the link direction [5]. It is calculated by (1).
The weight for proximity is based on the perpendicular distance ($D$) from the GPS point to the link. Generally speaking, the error in a GPS measurement can be reasonably described by a normal distribution $N(\mu, \sigma)$, so $f(D)$ is defined as in [7], where $d_j$ is the distance between the correct point and the candidate point. Based on empirical evaluation, a zero-mean normal distribution with a standard deviation of 20 m is used.
When the vehicle is at a junction, the link connectivity and turn restriction must also be considered [14]. They can be defined by (3) and (4). Let $TWS$ denote the total weight score; it is the sum of the four weights for a link at a junction, as given in (5).
Here $H_w$, $D_w$, $C_w$ and $T_w$ are the weight coefficients of heading, proximity, link connectivity and turn restriction, respectively. These coefficients represent the relative importance of the different factors in calculating $TWS$.
With the method above, the candidate point set of an FCD record can be obtained. If the difference between the maximum probability and the second-largest probability among the candidate points is less than a threshold, the first two candidate points are retained; otherwise, only the candidate point with the maximum probability is kept.
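The coarse matching logic can be sketched as follows. The exact forms of the individual weights are assumptions made for illustration: a cosine of the heading difference for the heading weight, a zero-mean Gaussian with a 20 m standard deviation (scaled to 1 at zero distance) for the proximity weight, and a +1/-1 score for link connectivity and turn restriction. The function names, the default coefficients and the threshold value are hypothetical; only the weighted sum and the "keep the top two if they are close" rule come from the text.

```python
import math


def heading_weight(vehicle_heading_deg, link_heading_deg):
    """Assumed form: cosine of the heading difference (1.0 when aligned)."""
    return math.cos(math.radians(vehicle_heading_deg - link_heading_deg))


def proximity_weight(distance_m, sigma_m=20.0):
    """Assumed form: zero-mean Gaussian of the perpendicular distance D."""
    return math.exp(-(distance_m ** 2) / (2.0 * sigma_m ** 2))


def total_weight_score(vehicle_heading_deg, link_heading_deg, distance_m,
                       connected, turn_allowed,
                       h_w=1.0, d_w=1.0, c_w=1.0, t_w=1.0):
    """TWS as the weighted sum of the four factors for one candidate link."""
    connectivity = 1.0 if connected else -1.0   # assumed +/-1 scoring
    turn = 1.0 if turn_allowed else -1.0        # assumed +/-1 scoring
    return (h_w * heading_weight(vehicle_heading_deg, link_heading_deg)
            + d_w * proximity_weight(distance_m)
            + c_w * connectivity
            + t_w * turn)


def select_candidates(scores, threshold):
    """Keep the top two links if their scores are close, else only the best.

    `scores` maps a link identifier to its score (or probability).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if len(ranked) >= 2 and scores[ranked[0]] - scores[ranked[1]] < threshold:
        return ranked[:2]
    return ranked[:1]
```

Keeping two candidates when the scores are close defers the decision to the precise matching stage, where the shape of the whole trajectory is available.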

2.2.2. Precise matching. The situation shown in Fig. 1 may arise after the coarse matching above, in which four points, namely the 5th, 6th, 7th and 8th points, are matched to the wrong road section. To further improve the accuracy and reliability of the matching results, this paper makes full use of the coarse matching result and determines a unique estimated position point according to the overall movement trend within a certain period of time. The method connects the FCD points to obtain a trajectory, which is then matched against the lines formed by the estimated points on the real road, so that the movement trajectory is unique. That is, each estimated position point is unique.
Assume that the vehicle location data $D_1, D_2, \cdots, D_k$ are obtained by GPS at $t = 1, 2, \cdots, k$, where $k$ is less than the update time of road conditions, and that the corresponding candidate points $E_1, E_2, \cdots, E_k$ in the road network are obtained by the coarse matching algorithm. The candidate point of each FCD record may not be unique, so the result is optimized as follows:
Step1. All candidate sections corresponding to the set $E = \{E_1, E_2, \cdots, E_k\}$ are found to build a road network structure, as shown in Fig. 1. In the figure, the arrows indicate the direction of a passable road; the blue circles are the original location data collected by the floating car; the red circles represent the candidate points after matching to road sections, where the 5th, 6th, 7th and 8th points each have two red points; the green lines are the candidate road sections; the yellow line is the line $L$ to be matched, obtained by connecting $D_i$, $i = 1, 2, \cdots, 9$, in turn.
Step2. Assuming that the initial and final positions are correct, all candidate routes can be determined from the candidate points, such as $L_1$ = 1-2-3-5-7 and $L_2$ = 1-2-4-6-7. If there is only one candidate route, it is taken directly as the best matching route.
Step3. $L_1$, $L_2$ and $L$ are discretized with the interval $d$, and the Freeman chain code (8 directions) is applied to code the discrete points, yielding the feature vectors $C_1$, $C_2$ and $C_L$, respectively.
Step4. The feature vector $C_L$ is matched to $C_1$ and $C_2$ in turn to find the best matching route, here $L_{max} = L_1$, which is taken as the movement track of the vehicle in this period. The similarity measure in this paper is the Hausdorff distance (max-min distance), which expresses the degree of dissimilarity between two point sets. Let $C_L = \{c_{L1}, c_{L2}, \cdots, c_{Lp}\}$ and $C_k = \{c_{k1}, c_{k2}, \cdots, c_{kq}\}$; the Hausdorff distance $H$ between $C_L$ and $C_k$ is defined by
$$H(C_L, C_k) = \max\{h(C_L, C_k),\ h(C_k, C_L)\}, \qquad h(C_L, C_k) = \max_{c_{Li} \in C_L}\ \min_{c_{kj} \in C_k} \|c_{Li} - c_{kj}\| \tag{6}$$
where $\|\cdot\|$ is the distance norm on the point sets $C_L$ and $C_k$.
Step5. Candidate points that do not lie on the best matching route $L_{max}$ are eliminated to obtain an accurate and unique result. In this way, erroneous candidate points can be found and eliminated, giving a more accurate result, which is of great significance for real-time traffic state estimation.
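The building blocks of Steps 3 and 4 can be sketched as below: discretizing a polyline at a fixed interval, coding it with an 8-direction Freeman chain code, and choosing the candidate route with the smallest Hausdorff distance to the trajectory. The function names, the code convention (0 = east, counted counter-clockwise) and the planar coordinates are assumptions. As a simplification, `best_route` compares the discretized points directly with a Euclidean norm, whereas the text compares the chain-code feature vectors, for which the element norm is not spelled out.

```python
import math


def resample(polyline, d):
    """Return points spaced d apart along a polyline given as [(x, y), ...]."""
    points = [polyline[0]]
    carried = 0.0  # distance walked since the last emitted point
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue
        pos = d - carried
        while pos <= seg:
            t = pos / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += d
        carried = seg - (pos - d)
    return points


def freeman_chain(points):
    """8-direction chain code of consecutive points (0 = east, CCW)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        codes.append(round(angle / (math.pi / 4.0)) % 8)
    return codes


def hausdorff(set_a, set_b):
    """Symmetric Hausdorff (max-min) distance between two point sets."""
    def directed(src, dst):
        return max(min(math.dist(p, q) for q in dst) for p in src)
    return max(directed(set_a, set_b), directed(set_b, set_a))


def best_route(track, routes, d):
    """Name of the candidate route closest to the track (simplified Step 4)."""
    track_pts = resample(track, d)
    return min(routes, key=lambda name: hausdorff(track_pts, resample(routes[name], d)))
```

For example, a track that runs east and then north is coded as a run of 0s followed by a run of 2s, and a candidate route with the same shape has a much smaller Hausdorff distance to it than one that runs north first.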
3. Traffic speed estimation. Let the GPS data sequence covering a section of road be $P_0 \to P_k$, the positioning time sequence be $t_0 \to t_k$, and the instantaneous speeds be $v_0 \to v_k$. The distance travelled during the time $t_0 \to t_k$ can be calculated by (7) [15] [16],
where $i = 1, 2, \cdots, k-1, k$. For GPS data with a fixed sampling interval, equation (7) can be simplified to (8), and the vehicle speed on the road is given by (9). The current road speed is estimated from both the instantaneous velocity and the average velocity, which effectively reduces the error, so $V$ is calculated by (10), where $b_1$ and $b_2$ are the correlation coefficients of the instantaneous velocity and the average velocity, respectively. We set $b_1 = 0.6$ and $b_2 = 0.4$ according to the experimental data.
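A minimal sketch of this speed estimate is given below. The forms used are assumptions: the distance is integrated from the instantaneous speeds with the trapezoidal rule, the average velocity is that distance divided by the elapsed time, the instantaneous velocity is taken as the mean of the reported speeds, and the two are fused as a weighted sum with $b_1 = 0.6$ and $b_2 = 0.4$. The function names are hypothetical, and the paper's own equations may differ in detail.

```python
def travelled_distance(speeds, times):
    """Trapezoidal integration of speed over time (assumed form of the distance)."""
    return sum((speeds[i - 1] + speeds[i]) / 2.0 * (times[i] - times[i - 1])
               for i in range(1, len(speeds)))


def road_speed(speeds_kmh, times_s, b1=0.6, b2=0.4):
    """Fuse instantaneous and average velocity into one road speed in km/h."""
    if len(speeds_kmh) < 2 or len(speeds_kmh) != len(times_s):
        raise ValueError("need at least two samples with matching timestamps")
    # Distance is in (km/h * s); dividing by seconds gives km/h again.
    distance = travelled_distance(speeds_kmh, times_s)
    average = distance / (times_s[-1] - times_s[0])
    # Assumption: the instantaneous term is the mean of the reported speeds.
    instantaneous = sum(speeds_kmh) / len(speeds_kmh)
    return b1 * instantaneous + b2 * average
```

With samples of 10, 20 and 60 km/h taken 30 s apart, the average velocity is 27.5 km/h and the mean instantaneous velocity is 30 km/h, so the fused estimate is 0.6 × 30 + 0.4 × 27.5 = 29 km/h.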
4. Visualization analysis. To make the visualization result more hierarchical and distinguishable, pseudo-color processing is applied to the result. In practical applications, the HIS transform, rainbow coding and hot metal coding are the main methods. We prefer the HIS transform, which is easy to implement and gives a better visual effect.

4.1. HIS transform. The HIS space is a frequently used perceptual color space. The symbol H represents the hue, which distinguishes different color features. The symbol I represents the intensity of the image. The symbol S represents the saturation, which reflects the concentration of the color [2].
Firstly, the values of intensity I, hue H and saturation S are obtained through the brightness transform, hue transform and saturation transform. Then the RGB values are obtained by the HIS transform to generate a new image. The processing procedure is shown in Fig. 2. The transform relationship between the HIS space and the RGB space is expressed in (11)-(13).
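As an illustration of the conversion from the HIS space back to RGB, the sketch below uses the common textbook sector-based formulas, with hue in degrees and saturation and intensity in [0, 1]. This is an assumption for the example: the formulas may differ in detail from (11)-(13), and the function name `hsi_to_rgb` is hypothetical.

```python
import math


def hsi_to_rgb(h_deg, s, i):
    """Convert hue (degrees), saturation and intensity (both 0..1) to RGB in 0..1."""
    h = h_deg % 360.0

    def sector(h_local):
        # Within one 120-degree sector: the low channel, the dominant
        # channel, and the remaining one that makes the mean equal to I.
        low = i * (1.0 - s)
        dominant = i * (1.0 + s * math.cos(math.radians(h_local))
                        / math.cos(math.radians(60.0 - h_local)))
        rest = 3.0 * i - (low + dominant)
        return low, dominant, rest

    if h < 120.0:            # red-green sector
        b, r, g = sector(h)
    elif h < 240.0:          # green-blue sector
        r, g, b = sector(h - 120.0)
    else:                    # blue-red sector
        g, b, r = sector(h - 240.0)

    def clip(x):
        return min(1.0, max(0.0, x))

    return clip(r), clip(g), clip(b)
```

With zero saturation the three channels all equal the intensity, so the result is a gray level; with full saturation and an intensity of 1/3, hues of 0, 120 and 240 degrees give pure red, green and blue, respectively.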
4.2. Visualization procedure. Fig. 3 shows the results obtained by the different pseudo-color processing methods [7] [19]. The results of rainbow coding and hot metal coding contain more colors than are needed, whereas the result of the HIS transform has a sufficient number of colors, all of which can be used for congestion visualization. Hence, to avoid loss or redundancy of information, the HIS transform is used for pseudo-color processing, and five colors are chosen in this paper. The visualization procedure (see Fig. 4) is as follows:
Step1. Divide the city map into 1024*1024 images.
Step2. Initialize the speed value of every area.
Step3. Acquire the data received during each update period (5 min).
Step4. Based on this division, estimate the speed of every sub-area using the method of Section 3.
Step5. Convert the speed value of every sub-area to a gray value in [0, 255]. As the research object is city roads, the upper limit of speed is set to 80 km/h and the lower limit to 0 km/h. Equation (14) shows the conversion.
Step6. Using the HIS transform, obtain the color image by combining the RGB results of all sub-areas.
Step7. Add the traffic congestion layer to the map for display to users.
Step8. Update the map every five minutes.
traditional distance map matching and this study is shown in Fig. 5. It shows that the matching efficiency and accuracy are improved by the method in this paper. After a large number of comparative experiments, some detailed parameters are summarized in Table 2. After map matching, based on the current road speed (Section 3), five colors obtained by the HIS transform are used to express the traffic congestion of urban roads, as shown in Fig. 6. The relationship between speed and the colors is listed in Table 3; the classes were divided according to our experience. Based on the platform and the algorithm above, congestion visualization is achieved, and the result for 7 PM on March 12, 2015 is shown in Fig. 7. Through field investigation of a large number of road sections, we find that its reliability is high, with an accuracy of more than 94%.
6. Conclusions. A traffic congestion visualization analysis algorithm based on FCD is proposed in this paper, which can accurately analyze and display the real-time road traffic condition. It has better anti-noise performance for GPS data, and its reliability and real-time performance are superior to those of the traditional algorithm. The FCD can be accurately matched by the improved topological map matching algorithm, with an accuracy of up to 96%, which is clearly higher than that of the traditional distance map matching algorithm. The current road speed is estimated from the instantaneous velocity and the average velocity to describe congestion states, which, after the HIS transform, are visualized on the map in five colors, making the visualization clearer. With its high reliability and fault tolerance, the algorithm proposed in this paper can meet the basic requirements of people's daily travel.