See also an update till 2012 and a supplement of 2015

**Weather at TRAO since 2000**

TRAO
(Torun Radio Astronomy Observatory or Department of Radio Astronomy
of the TCfA) meteo data have been collected since 2000 from
the weather
station WST7000 of IRDAM SA.
It is mounted atop a pole rising 4.2 m above the roof of an adjacent building.
** Introduction**
**WST7000 closeup (top) and the NEE view of its surroundings
(bottom)** (Photos courtesy of J. Mazurek, 9 April 2008)

The station outputs data at the rate of 10 messages per second. Each message
contains all five meteo quantities.
The accuracies (RMS) specified by the manufacturer are as follows (for more
details consult the *Technical Manual*; compare also a complete manufacturer's
specification sheet in pdf format):

±5° for the wind direction,

±1 °C for the air temperature,

±3 % (RH at 20 % to 90 %) to ±4 % (elsewhere) for the relative humidity, and

±1 hPa (at 23°C) to ±3 hPa (at -40°C to 60°C) for the atmospheric pressure.

The last number in these records is meant for an optional external sensor (which is absent in our installation).

DoY Test V D T P RH Td
2.000093 0 0xff 4.19 103.42 -0.68 1018.31 100.00 -0.68 0.056
2.000208 0 0xff 3.88 97.41 -0.62 1018.34 100.00 -0.62 0.055
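As an illustration of how such a record might be read, here is a minimal Python sketch. It rests on several assumptions that the sample itself does not confirm: that `Test` covers the two fields `0 0xff`, that V, D, T, P and RH are wind speed, wind direction, temperature, pressure and relative humidity, and that `Td` is the dew point. The names `MeteoRecord` and `parse_record` are hypothetical.

```python
from typing import NamedTuple, Tuple


class MeteoRecord(NamedTuple):
    doy: float              # "DoY": fractional day of year (assumed meaning)
    test: Tuple[str, str]   # the two raw fields assumed to belong to "Test"
    wind_speed: float       # "V"
    wind_dir: float         # "D"
    temperature: float      # "T"
    pressure: float         # "P"
    humidity: float         # "RH"
    dew_point: float        # "Td" (assumed to be the dew point)
    external: float         # optional external sensor, unused at TRAO


def parse_record(line: str) -> MeteoRecord:
    """Split one whitespace-separated record into named fields."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    return MeteoRecord(float(fields[0]), (fields[1], fields[2]),
                       *map(float, fields[3:10]))


first = parse_record("2.000093 0 0xff 4.19 103.42 -0.68 1018.31 100.00 -0.68 0.056")
second = parse_record("2.000208 0 0xff 3.88 97.41 -0.62 1018.34 100.00 -0.62 0.055")
# Time step between the two sample records, in seconds
step_seconds = (second.doy - first.doy) * 86400.0
```

Under these assumptions the two sample records are spaced about 10 s apart, which agrees with the 10-second averages mentioned elsewhere in this report.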

A word of warning for a potential user: some daily files begin with a record that belongs to the end of the previous day's file (but is not present there).

Measurement results for the current day are displayed online on the TRAO website.

In this document we present the results of some simple statistical analyses of
the entire data set, from the installation of the meteo station
in mid 2000 until March 10, 2008.
These results were obtained after removing evidently erratic records.
A record was considered erratic if any of its meteo data fell outside
the specified range. The ranges were set somewhat arbitrarily (but with care)
as follows:

0 to 65 m/s for wind speed,

-40 to 40 °C for air temperature,

956 to 1040 hPa for atmospheric pressure, and

0 to 100 % for humidity.
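The range test described above can be sketched in a few lines of Python. This is only an illustration: the dictionary keys, the name `is_erratic`, and the choice of inclusive limits are assumptions, not details taken from the original reduction software.

```python
# Accepted ranges as listed above (limits assumed to be inclusive)
RANGES = {
    "wind_speed": (0.0, 65.0),      # m/s
    "temperature": (-40.0, 40.0),   # deg C
    "pressure": (956.0, 1040.0),    # hPa
    "humidity": (0.0, 100.0),       # %
}


def is_erratic(record: dict) -> bool:
    """Return True if any listed quantity falls outside its accepted range."""
    return any(not (low <= record[name] <= high)
               for name, (low, high) in RANGES.items())


good = {"wind_speed": 4.19, "temperature": -0.68,
        "pressure": 1018.31, "humidity": 100.0}
bad = dict(good, pressure=950.0)   # pressure below the 956 hPa limit
```

Here `is_erratic(good)` is false and `is_erratic(bad)` is true. A test of this kind cannot catch corrupted values that happen to lie inside the accepted ranges.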

Only 75 records (out of over 22 000 000 analysed) were erratic by this definition.
All of them were evident outliers, with values far removed from those
in the surrounding records. This, however, means that there almost certainly are
spoiled records with values within the ranges specified for 'good' data.
The statistics (distributions presented in this report) also seem to indicate
an excess of data in some parts of the ranges.

Presented below are the daily high and low values of the four main quantities. A tabular form of the plotted data is also available. These diagrams consist of individual lines drawn between the minimum and maximum values of a given quantity, found separately for each day.


The plots that follow have these widths of bins (horizontal resolutions):
Each of the above record values has been checked to confirm that it represents
a gradually reached extremum and is not a fake value due to a corrupted reading.
This was done by visual inspection of neighbouring measurements in
the respective daily file.
In view of the suspect quality of some of our measurements we have compared them
with data obtained at a nearby (about 8.5 km away) professional
meteo station
located in Koniczynka village. The station is supervised by Dr Marek Kejna of
the Department of Climatology,
Institute of Geography, who kindly made his data available to us
(a big thank you to him, and also to Zsuzsa Vizi for help in
data format conversion).
The Koniczynka data taken for comparison were 1-hourly averages in the period
from 1 January 2003 to 31 December 2007, Central European Time. There were
39110 (out of 43824 possible) hours in which
at least one of the five quantities was measured at both stations.
These were plotted against each other as shown in the figures that follow.
Overlapping points are plotted horizontally offset. The red lines represent
the ideal case of perfect correlation of the respective measurements at both
stations, whereas the olive lines correspond to a least squares fit
(see the table further down this page).
The following table contains numerical results of linear fits. The
best fitted lines are of the form P =
One notes very good correlation of the air temperature and atmospheric
pressure, and high dispersion of the wind quantities. The Piwnice pressure
is on average lower than that in Koniczynka only by 0.6 hPa
at 960 hPa and by 0.2 hPa at 1040 hPa, while the temperature is higher
by 0.9 to 0.6°C at -25 to +35°C, respectively.
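The offsets quoted above follow from the fitted lines P = a + b·K. As a quick check, the short Python sketch below evaluates P − K at the ends of the quoted ranges, using the pressure and temperature coefficients from the table of linear fits in this report. The names `FITS` and `offset` are introduced here for illustration only.

```python
# (a, b) of the fitted lines P = a + b*K, copied from the table of linear fits
FITS = {
    "pressure": (-6.09639, 1.00570),
    "temperature": (0.76268, 0.99409),
}


def offset(quantity: str, k: float) -> float:
    """Mean difference Piwnice minus Koniczynka predicted by the fit at value k."""
    a, b = FITS[quantity]
    return a + b * k - k


pressure_offsets = [round(offset("pressure", k), 1) for k in (960.0, 1040.0)]
temperature_offsets = [round(offset("temperature", k), 1) for k in (-25.0, 35.0)]
# pressure_offsets    -> [-0.6, -0.2]  (hPa)
# temperature_offsets -> [0.9, 0.6]    (deg C)
```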
In the case of the wind direction, the decorrelation may partly be explained
by inappropriate averaging of the data originally read from the station. During
this initial data reduction the mean angle is calculated in the same way as
the means of the other quantities.
A proper algorithm should rely on calculating the mean sine and the mean
cosine of the angles being averaged, where the summations are carried over all N measurements. That is how
our wind direction data were further averaged to obtain the 1-hour means
for this particular plot. Judging by the depth and width of the gap near
the direction of 0° in the angular distribution (see the figure on the left,
which is a circularly rearranged and expanded display of a part of the
distribution of wind directions presented earlier), it is possible to estimate the number
of measurements swept away from there. This number amounts to about 1 %
of all measurements. That many data were affected by the simple
(inappropriate) averaging of angles near 0 and 360°. The data that originally
belonged to the depression at 0° must have been spread over the entire 360°
range, with a maximum at 180°.
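The difference between the two ways of averaging can be seen in a minimal Python sketch of the mean sine and mean cosine approach. The function name `circular_mean_deg` and the mapping of the result onto the range 0° to 360° are choices made here for illustration, not details of the TRAO software.

```python
import math


def circular_mean_deg(angles):
    """Mean direction in degrees from the summed sines and cosines."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    # atan2 of the sums equals atan2 of the means; map the result onto [0, 360]
    return math.degrees(math.atan2(s, c)) % 360.0


winds = [350.0, 10.0]                 # two directions on either side of north
naive = sum(winds) / len(winds)       # arithmetic mean of the angles: 180.0
proper = circular_mean_deg(winds)     # within rounding error of 0 (or 360)
```

The arithmetic mean places these two northerly winds at 180°, which is the kind of displacement described above, while the circular mean keeps them at north.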
The analysis of raw data presented above has indicated the presence
of corrupt recordings. To improve the usability of this database an attempt
has been made to clean it of the more obvious erratic records and
to tag some errors, essentially only with respect to the atmospheric pressure
and temperature measurements. The basic search for errors relied on
comparing the deviation of each measurement (a 10-second
average as usually stored in the archives) from the mean value
of the temperature and pressure with the standard deviation
calculated for various time intervals. The intervals ranged from 4 minutes
to 1 hour. If a value deviated from the mean by a few standard
deviations it was further compared to neighbouring measurements
and automatically tagged as erratic only if there were 'normal'
neighbours on both sides. There were cases in which two consecutive
measurements were erratic, and these were treated individually.
In this way we have detected a few hundred errors, most of which
belonged to the pressure measurements. Another cleaning step was
based on a search for exactly the same numerical values repeated
in a number (10 or more) of consecutive measurements. This allowed us to remove
many cases in which apparently all the sensors simultaneously 'froze' for
a few minutes. Such records were not tagged but were erased altogether
from the daily files.
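The two cleaning steps described above might be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: the threshold `k` stands for "a few standard deviations" (its value is not given in this report), a single interval of data is passed in at a time, and the names `tag_outliers` and `frozen_runs` are hypothetical.

```python
import statistics


def tag_outliers(values, k=4.0):
    """Flag values deviating from the interval mean by more than k standard
    deviations, but only when both neighbours look normal (k is an assumption)."""
    flags = [False] * len(values)
    if len(values) < 3:
        return flags
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0.0:
        return flags
    deviant = [abs(v - mean) > k * sd for v in values]
    for i in range(1, len(values) - 1):
        if deviant[i] and not deviant[i - 1] and not deviant[i + 1]:
            flags[i] = True
    return flags


def frozen_runs(records, min_len=10):
    """Return (start, end) index pairs, end exclusive, of runs of at least
    min_len identical consecutive records (timestamps excluded)."""
    runs = []
    start = 0
    for i in range(1, len(records) + 1):
        if i == len(records) or records[i] != records[start]:
            if i - start >= min_len:
                runs.append((start, i))
            start = i
    return runs


pressures = [1000.0] * 30
pressures[15] = 1050.0                       # one isolated spike
spike_flags = tag_outliers(pressures)

readings = [(1.0, 2.0)] * 3 + [(5.0, 6.0)] * 12 + [(7.0, 8.0)]
frozen = frozen_runs(readings)               # the 12 identical readings
```

Pairs of consecutive deviant values are deliberately left untagged by `tag_outliers`, mirroring the statement that such cases were treated individually.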
Finally, a search for incorrect time tags was performed. There
are many cases (some 30 000) in which two neighbouring records have
the same time stamp (whereas they are expected to differ by 10 s).
About 2700 cases were discovered in which the time of the next record was
earlier than that of the current one, and in 8 files there are backward jumps
in time exceeding 10 minutes and reaching 1 hour (in one case there
is a 2-hour jump!). Unfortunately, only one of the latter cases could
be corrected by shifting a portion of earlier data in time. It seems
that the backward jumps are due to a fast computer (internal system)
clock.
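A check of this kind is simple to express in code. The Python sketch below scans a sequence of time stamps, taken here to be seconds since the start of the file (an assumption about representation), and reports repeated stamps and backward jumps. The function name `check_time_tags` is hypothetical.

```python
def check_time_tags(times):
    """Return indices of records repeating the previous time stamp, and
    (index, jump_size) pairs for records earlier than their predecessor."""
    duplicates = []
    backward = []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt == 0:
            duplicates.append(i)
        elif dt < 0:
            backward.append((i, -dt))
    return duplicates, backward


# Nominal 10 s cadence with one repeated stamp and one backward jump of 15 s
stamps = [0, 10, 10, 20, 5, 15]
duplicates, backward = check_time_tags(stamps)
```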
The corrected database, encompassing measurements till 1 June 2008 (inclusive),
now consists
of 22797080 healthy records (lines) and 347 records tagged as
having an erratic pressure or temperature measurement. The summary statistics
do not differ by more than 0.1 from those already presented in this report
and calculated prior to the correction,
with the sole exception of the median relative humidity, which is now
equal to 99.1 %.
This database has been reduced to 1-hour averages
and is available for download in the form of one
zipped file. This big file (uncompressed it is
about 4 MB in size) contains an ASCII table, a header plus 65639 data rows,
which begins thus:
Despite a considerable amount of cleaning and corrections, the statistical
properties of our database did not change much, so the results
presented for the original data remain valid. This also applies to the one-hour
data (see this analysis and figures) and to
the correlations with the Koniczynka data, which are now only slightly
better. For example, the atmospheric pressure, the most affected
quantity, is now linearly related to
the Koniczynka data through this equation:
P
** Distributions of measured values**

0.1 m/s for the wind speed,

2 arc degrees for the wind direction,

0.5 °C for the air temperature,

0.5 hPa for the atmospheric pressure, and

0.5 % for the relative humidity.

The numbers shown (plus the few outside the figure frames) in each of
the five plots sum up to 22114598, i.e., the number of records accepted
as 'good'.

The wind distributions exhibit two peculiarities. One is a 'bump',
a secondary mode at about 1 m/s in the velocity distribution,
and the other is a wavy pattern in the angular distribution. To check
whether the two features come from the same source we have inspected
the angular distribution of winds belonging to the bump, i.e.
with speeds smaller than 1.3 m/s. Such a distribution does not
contain the wavy pattern. In fact the pattern is only marginally visible
in winds slower than the modal value of 3.2 m/s (see the second
diagram below), so the stronger winds seem to be responsible.
Notably, this wavy structure in the angular
distribution is present in about equal measure in older data and
more recent data, thus it is not due to any aging effect.
One possible source could be the three vertical rods placed
around the station (meant to safeguard it against lightning
discharges). Seen from the wind's point of view, the rods form a perfectly
symmetrical structure with respect to the station (and thus should not
distort the wind direction) every 30 degrees. Between these 12
directions the structure is asymmetrical, and thus may distort the air
flow. Most of the peaks of the wavy pattern in the distribution
are separated by just about 30 degrees.

The large number of high values seen in the last of the distributions,
that of the relative humidity, is indicative of incorrect measurements of
this quantity. The next
plot shows the fraction of these higher values relative to all
the measurements in each 0.2-year (73-day) division of the period in question.
Here we see an evidently systematic rise of the fraction of higher values
in successive years, which suggests something is going wrong with this sensor
(a capacitive transducer with thin film, mounted at the bottom of the station).
At the end of 2007 it produced data of which as many as 93 % were
equal to 100 %. This is more than three times the corresponding percentage
for 2001 and 2002. Apparently, about two years after our measurements
started, the quality of the sensor began to deteriorate, and at present its
data should be considered unreliable.

To find the median and mode we have analysed distributions with up to an
order of magnitude finer resolution than those presented in the preceding section.

*Extreme values:*

Highest temperature: **37.4 °C** on 2007 July 17 at 13:59 UTC

Lowest temperature: **-25.5 °C** on 2006 January 23 at 6:35 UTC

Highest pressure: **1039.9 hPa** on 2006 January 23 at 8:17 UTC

Lowest pressure: **959.1 hPa** on 2007 January 19 at 0:02 UTC

Maximum wind speed: **64.0 m/s** on 2003 July 18 at 12:20 UTC

** Comparison with Koniczynka station data**

The comparisons demonstrate that the first four quantities of the TRAO (Piwnice)
station can be regarded as roughly correct. Unfortunately, the fifth one,
the relative humidity, looks definitely wrong. There is a deficiency of values
below about 20 %, which seem to be systematically shifted to abnormally
high values. On the other hand, the presence of such low values in
the Koniczynka data is also suspect.
Our humidities exceeding 20 % are on average 15 % higher than the humidities
in Koniczynka and have excessive scatter (compared to the nominal accuracy
of 3 to 4 percent).
*a* + *b*·K, where P is
a Piwnice quantity and K is the corresponding quantity at the Koniczynka
station. Besides the regression line parameters, *a* and *b*
(with their estimation errors),
the table also contains the correlation coefficient *R*, the standard
deviation about the fitted line *SD*, and the number of data used for
the fit *N*.

| Quantity | a | b | R | SD | N |
|---|---|---|---|---|---|
| Wind speed | 2.60677 ± 0.01097 | 0.44951 ± 0.00276 | 0.63583 | 1.31908 | 39063 |
| Wind dir.\* | 20.18854 ± 0.3538 | 0.94018 ± 0.00169 | 0.94239 | 33.1683 | 39063 |
| Temperature | 0.76268 ± 0.00761 | 0.99409 ± 0.00061 | 0.99282 | 1.05595 | 38655 |
| Pressure | -6.09639 ± 0.38556 | 1.00570 ± 0.00038 | 0.99724 | 0.69618 | 38136 |
| Humidity\*\* | 15.05347 ± 0.20228 | 0.93661 ± 0.00261 | 0.93226 | 6.85464 | 19324 |

\* Fit performed to Piwnice data circularly reduced to the range ±180
degrees about the line of perfect correlation.

\*\* Fit to Koniczynka humidities greater than 19 %.
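For readers who wish to reproduce fits of this kind, the following Python sketch computes the quantities listed in the table (a, b with their standard errors, R, SD and N) by ordinary least squares. It is an illustration, not the code used for this report. In particular, taking SD as the square root of the residual sum of squares divided by N − 2 is an assumption, and the name `linear_fit` is hypothetical.

```python
import math


def linear_fit(k, p):
    """Ordinary least squares fit of p = a + b*k with basic diagnostics."""
    n = len(k)
    if n != len(p) or n < 3:
        raise ValueError("need two equally long sequences of at least 3 points")
    mean_k = sum(k) / n
    mean_p = sum(p) / n
    s_kk = sum((x - mean_k) ** 2 for x in k)
    s_pp = sum((y - mean_p) ** 2 for y in p)
    s_kp = sum((x - mean_k) * (y - mean_p) for x, y in zip(k, p))
    b = s_kp / s_kk
    a = mean_p - b * mean_k
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(k, p))
    sd = math.sqrt(ss_res / (n - 2))          # scatter about the line (assumed N-2)
    return {
        "a": a,
        "b": b,
        "a_err": sd * math.sqrt(1.0 / n + mean_k ** 2 / s_kk),
        "b_err": sd / math.sqrt(s_kk),
        "R": s_kp / math.sqrt(s_kk * s_pp),
        "SD": sd,
        "N": n,
    }


# Small synthetic example, not real station data
fit = linear_fit([0.0, 1.0, 2.0, 3.0], [1.1, 2.9, 5.1, 6.9])
```

For the wind direction, the footnoted circular reduction to ±180 degrees about the line of perfect correlation would have to be applied to the data before calling such a function.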
*D*_{i} (i = 1, 2, ... N),
and then taking the arctan2 function of the two means or just sums:

$$\langle D \rangle = \operatorname{arctan2}\!\left(\frac{1}{N}\sum_{i}\sin D_i,\ \frac{1}{N}\sum_{i}\cos D_i\right) = \operatorname{arctan2}\!\left(\sum_{i}\sin D_i,\ \sum_{i}\cos D_i\right), \qquad (1)$$

** Corrected and reduced data (downloadable)**
TCfA Hourly Weather Data
(www.astro.uni.torun.pl/~kb/Reports/Meteo/MeteoSince2000.htm)
Time Temperat. Pressure Humidity Wind Vel. Dir. N
UTC Mean SD Mean SD Mean SD Mean SD Mean
[year] [°C] [hPa] [%] [m/s] [°]
2000.57508 18.8 0.3 1000.2 0.1 92.5 1.5 5.1 0.7 14.8 204
2000.57519 19.8 0.2 1000.1 0.1 88.1 2.5 5.0 0.9 351.7 354

Each data line of this table corresponds to a time interval equal
to 1 hour between
integer UTC hours. The time indicated is given for the center of the interval.
For example, the first line begins with 2000.57508, which signifies the year
2000 and the days passed since 0 hours UTC on 1 January: 0.57508*366 =
210.47928, i.e. the 211th day of the year and the UTC interval beginning at
the hour equal to the integer part of 0.47928*24 = 11.5, i.e. 11:00 UTC.
Note carefully that in the above decoding of time the factor 366
applies to leap years only (2000, 2004 and 2008); for other years
it is 365.
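The decoding just described can be written as a short Python function. This is a sketch of the arithmetic in the text and nothing more. The name `decode_hourly_time` is hypothetical, and the function returns the start of the 1-hour interval rather than its center.

```python
import calendar
from datetime import datetime, timedelta


def decode_hourly_time(t: float) -> datetime:
    """Convert a fractional-year time tag to the UTC start of its 1-hour interval."""
    year = int(t)
    days_in_year = 366 if calendar.isleap(year) else 365
    day_float = (t - year) * days_in_year       # days since 0 h UTC on 1 January
    whole_days = int(day_float)
    hour = int((day_float - whole_days) * 24)   # integer UTC hour of the interval
    return datetime(year, 1, 1) + timedelta(days=whole_days, hours=hour)


start = decode_hourly_time(2000.57508)
day_of_year = start.timetuple().tm_yday
# start       -> 2000-07-29 11:00 (UTC)
# day_of_year -> 211
```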
The next nine columns contain the mean weather quantities
(columns headed '`Mean`') with standard deviations
('`SD`'). The wind direction mean ('`Dir`') is not
accompanied by SD because of nonstandard averaging (which was angular
averaging according to Eq. (1)).
The rightmost column shows the number
of daily file records used for calculation of the means.
$$P_{\mathrm{Piwnice}} = (-6.43728 \pm 0.36735) + (1.00604 \pm 0.00037)\,P_{\mathrm{Koniczynka}},$$

** Recommended actions to take** \*

Since the humidity sensor produces strongly biased data, it would
be advisable to return the station to the manufacturer for refurbishment
and upgrade of the firmware.

Remove or rotate the shielding 'cage' built around the station
to see if the wavy pattern in the wind angular
distribution disappears or rotates in response.

The external data acquisition software (a product of TCfA, written by
E. Pazderski) should be amended
to perform the initial averaging of the wind direction data according to
Eq. (1).

\* The angular averaging has been implemented on 5 June 2008.

Posted 11 March 2008

Last modified: 10 June 2008
