Data preprocessing for exploratory visual analysis

alm · January 4, 2020, 5:27pm

Dear all,

Given the large amounts of data that are processed and visualized today, the data is usually preprocessed and slightly altered before we arrive at the final values that are actually shown. This includes steps such as data cleansing, smoothing, filtering, clustering, and normalization.

However, many visualizations are used in an exploratory context or to confirm a hypothesis about a particular data set. In these cases, does it really make sense to change the data in any way before visualizing it?
Obviously, data very often needs to be fitted to what the visualization is able to show. But even if it is made clear to the user that the data has been slightly changed, important facts about the data could be missed, since the resulting visualization most likely won’t give the user a true picture of the data. And if the user has already built up a personal picture of the data set, they can easily forget that the visualization does not show the actual, unchanged data.
On the other hand, if the data is not preprocessed, the visualization could end up as a mess that cannot easily be made sense of or analyzed.
What are your thoughts on this? Do you think there could be a solution that makes the data easy to analyze while still showing the exact, real data?
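
To make the question a bit more concrete, here is a minimal sketch of the kind of effect I mean. The readings are made up and the `moving_average` helper is only a hypothetical stand-in for whatever smoothing a real tool applies. It is not taken from any particular library.

```python
from statistics import mean


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    return [
        mean(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


# Hypothetical readings with a single outlier at index 4.
raw = [10, 11, 10, 12, 50, 11, 10, 12, 11, 10]
smoothed = moving_average(raw, window=5)

# Keep the raw values next to the smoothed ones so the change stays visible.
for i, (r, s) in enumerate(zip(raw, smoothed)):
    print(f"{i:2d}  raw={r:3d}  smoothed={s:6.2f}")

print("raw max:     ", max(raw))
print("smoothed max:", round(max(smoothed), 2))
```

In the smoothed series the outlier is flattened into a small bump, so a chart of only the smoothed values would hide it. Showing both series together is one possible compromise, but I am not sure how well that scales to large data sets.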

Thanks very much in advance!