Romanenko V.N.

North-Western Branch of Academy of Information Technologies in Education

Gatchina, Leningrad district. Russia

Chairman of Council. Honored Scientist of the Russian Federation, professor, Ph.D.

Nikitina G.V.

North-Western Branch of Academy of Information Technologies in Education

Gatchina, Leningrad district. Russia

Vice-Chairman of Council. Honored worker of science and education of RAS, professor, Ph.D.
Finding the necessary information is frequently an art. Therefore, many of the skills in this field can be developed only through practical work. Theoretical explanations are useful as an introductory part of the instructional technology, but the main KSAOs (knowledge, skills, abilities, and other characteristics) can be created only through practical training. To organize such a process, it is necessary to compose a set of interesting problems. Instructors must supervise all stages of the trainees' learning. Discussion in recitation classes is the final part of the proposed educational technology. Some interesting real situations are given as examples of such a strategy. This method is also useful for practice in fact checking.
Keywords: Collective learning, Cultural background, Data mining, Fact-checking, Knowledge discovery, Market basket analysis, Reliable data.

All beings in the world are connected with each other through different types of interactions. Three basic flows realize these interactions, and one of them is the flow of information. Living beings actively react to all external influences. Animal memory consists of inherited and acquired information, and an animal's brain stores information that reflects its personal experience. Yet only humans can store, transform, and transmit the experience of other individuals and of previous generations. This specific ability of the human brain emerged as a result of a long evolutionary process. The resulting ability to accumulate the experience of various persons is called Collective Learning [1]. On its basis, the system of humankind knowledge arises. Its generalized representation is given in Fig. 1. The Primitive Core of Humankind Knowledge stores data that all people, whatever their culture and educational background, possess. This core includes information about weather and seasons, the differences between fluid and solid substances, and so on. The major part of this core is developed at an early age in each person.

Fig. 1. General structure of humankind knowledge

The Basic Core is culturally and educationally specific. This part of knowledge constantly changes with time. Our goal is to discuss some parts of the Basic Core that each educated person has to include in his (her) professional background. Literacy, basic mathematics, history, and languages are a few of them. The ability to work with information is an essential part of the Basic Core of an educated person. The problems of studying information processes are diverse. Many of them have been included in educational strategies since ancient times. It is known that the processes called the Information Revolution, or the Third Wave [2], started approximately in the second half of the 20th century. As a consequence, the flow of new information that people receive in modern society has increased dramatically, and the classical forms of information work have changed greatly. Some new forms that have never been studied previously have emerged. A modern educated person should be able to find, store, and provide information. Nowadays these traditional tasks have become much easier, mainly because of their automation. At the same time, the increased flow of information has put new issues on the agenda. The most important of them are the ability to correctly classify the found data and the ability to determine their reliability and validity. The second of these tasks was never included in the curriculum [3]. This problem is the main subject of this paper.

The main problems to be solved in selecting the right information

Various forms of Internet searching permit people to receive large numbers of different resources quickly and efficiently. It is practically impossible to become fully familiar with all their data. An educated person must have the competency that helps him (her) to sort the found resources and cut off the content that is not needed. The necessary KSAOs for such activity should be developed from the first days of training. The average freshman usually has some practical skills that were acquired in high school and maintained through various contacts in social networks. Practical studies and different testing procedures, some of which are described in [3], show that these skills are superficial and have little theoretical support. At the same time, sophomores generally lack sufficient motivation for serious study of these problems. Therefore, an optimal instructional strategy should be divided into several successive stages. It must start with a short introductory course. The next stage is everyday practice, where the necessary KSAOs are created as a result of hidden processes. The last stage consists of final theoretical lectures in the pre-diploma period. All teachers must be involved in this process, and specially trained librarians should help students during their learning. The instructional strategy should incorporate periodic discussion of the results in recitation classes.
The first stage of training must develop practical skills for estimating the quality of data sources. The simplest activity is connected with everyday household searches: searching for news, personal addresses, shopping venues, and so on. The trainee has to cultivate a set of sustainable skills. The first of them is the habit of checking each new contact. Before reading a new resource or an interesting letter, one should read all the information about its author that is available in various Internet sources. This covers more than the set of formal data that an author gives in social networks. Publications, comments, and participation in different meetings are more informative. Detected inconsistencies and inaccuracies in the data should provoke a cautious attitude toward the new person.
The second habit to be formed at the first stage of training is the evaluation of different sources of information. The necessary skills can be developed by getting acquainted with various news portals, newspapers, and so on. Experienced instructors can show that each news source has its own update periods, some of which are very short. In some cases a news item appears and then quickly disappears. This suggests that the item has not been confirmed or is false. In such cases it is wise to stop using the source, because it may not be reliable.
An educated person needs the ability to evaluate references and scientific data. In most cases, when searching for scientific data, people turn to articles in Wikipedia. It is simple and convenient. However, Wikipedia articles are not necessarily written by experts, so this approach is not valid for serious research. A student must realize that reference sources of this kind are useful only for initial familiarization with a problem. One needs to understand that such sources are valuable primarily because they usually give references to important and proven professional works on the subject.
At the same time, familiarity with the peculiarities of Wikipedia allows trainees to understand the principles of sophisticated professional information research. During practical sessions, or when analyzing students' independent work, the instructor can demonstrate how a statistical analysis of a number of articles on a particular problem allows one to estimate society's interest in various issues. Each instructor must keep several examples to demonstrate to students and groups during discussions. It is appropriate to include the detailed study of this problem in the curriculum for the pre-diploma period.
The stage of preliminary training in computer-assisted information search should be devoted to familiarizing trainees with search operators, the specifics of search engines, and the main methods of searching. Students' competency in this area must be assessed with the help of testing procedures and the preparation of various summaries and small investigations. At this stage, a student must clearly understand and follow the basic rule: a good search result is possible only with a well-specified question. For this reason, high-quality searches can be ensured only if the search queries are successively refined on the basis of the results already obtained. This is called the method of successive refinements.
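The method of successive refinements can be illustrated with a minimal sketch. It assumes a tiny in-memory collection and a search that simply requires every query term to occur in a document. The collection `DOCUMENTS`, the functions `search` and `refine`, and all names in the data are hypothetical and serve only as an illustration; real search engines rank and match results in far more complex ways.

```python
from typing import Iterable

# Hypothetical toy collection; all names and texts are invented for illustration.
DOCUMENTS = {
    1: "acme consulting new york financial services",
    2: "acme consulting boston software john smith owner",
    3: "acme bakery boston",
    4: "john smith biography physics thesis",
}


def search(terms: Iterable[str]) -> list[int]:
    """Return the ids of documents that contain every query term."""
    wanted = [t.lower() for t in terms]
    return sorted(
        doc_id
        for doc_id, text in DOCUMENTS.items()
        if all(t in text.split() for t in wanted)
    )


def refine(initial: list[str], extra_terms: list[str], max_hits: int = 1):
    """Add one clarifying term at a time until the result set is small enough.

    Returns the history of (query, hits) pairs so that each refinement
    step can be discussed with trainees.
    """
    query = list(initial)
    history = [(list(query), search(query))]
    for term in extra_terms:
        if len(history[-1][1]) <= max_hits:
            break  # the question is already specified well enough
        query.append(term)
        history.append((list(query), search(query)))
    return history


steps = refine(["acme"], ["consulting", "boston"])
for query, hits in steps:
    print(" ".join(query), "->", hits)
```

In this sketch each added term narrows the result set, so the trainee can see how the obtained results suggest the next specification of the question. A refinement that returns no results at all would call for stepping back to the previous query, which this sketch does not handle.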
At the final stage, each student has to know the main problems connected with the effective selection of reliable information from large data arrays. Their enumeration and brief description are given in the next section.


Enumeration and classification of factors which determine the reliability of found sets of data

There are several major groups of reasons leading to the emergence of erroneous and unreliable data. The first group combines the reasons that arise accidentally. Different mistakes, misprints, and misunderstandings are a few of them. The second group contains incorrect data associated with the incompetence of authors or of the people who did the search. These groups of unreliable data may be fully or partially corrected using standard techniques. All of these methods are traditionally studied in the curricula of the various disciplines that prepare students for their future activities. The behavior of different data sources is described during the teaching of all studied subjects. Therefore, a normal student curriculum supports the hidden processes that create the main part of the necessary competencies.
In contrast, two other groups of unreliable data are the results of actions aimed at the preplanned introduction of wrong information. Training methods for the detection and suppression of such data are described below. The goals for creating unreliable data can be divided into two parts. The first one gathers different forms of trolling, jokes, and deliberate misrepresentation. These actions are widespread. As a rule, they do not use any serious professional technique, and a series of skilled actions is enough to identify them. The second part of incorrect data is a result of criminal or political goals. Many of these actions are professional, and their detection is a serious problem. Detecting them and understanding their mechanisms is difficult. In most cases these actions are based on special software. Therefore, methods for their detection are beyond the scope of this paper. The authors do not want to involve this didactic material in political and criminal problems. In our opinion, it is not correct to extend educational practice to these questions or to include instructional strategies for their suppression in the traditional curriculum. So, further discussion will be devoted only to the detection and elimination of incorrect data that is not connected with software operations. We have no intention of studying the ideological and mental basis that stimulates the creation of false data.
Let us try to identify reasonable instructional technologies that can create the necessary competencies in the field of verifying data reliability. Different strategies for detecting unreliable data are known. The first and certainly the simplest question to be taken into account is how well the found data match the search request. Answering this question is a classical problem of Library Science. There are six numerical coefficients that can describe the efficiency of the searching process: Recall, Precision, Resolution, Elimination, Noise, and Omission. Only the first three of them are actually used in everyday practice. They are usually studied in all serious theoretical lectures devoted to information searches on the Internet. Everyday practical work under tutorial guidance creates primary competencies in this field after two or three years of study. During this period a trainee must understand that the main problem in building a search strategy is, in each case, connected with answering the basic question: which is better, to find all or almost all correct answers and additionally receive a lot of invalid results, or to quickly obtain one very reliable answer? The instructor has to explain to the students that the answer to this question depends on the type of problem being solved. In different situations different answers are possible.
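The trade-off behind this basic question can be made concrete with the two most widely used coefficients. The sketch below uses the standard textbook definitions, which are not spelled out in this paper: Precision is the share of retrieved items that are relevant, and Recall is the share of relevant items that were retrieved. The function name `precision_recall` and the document identifiers are hypothetical.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Compute (precision, recall) for one search result.

    precision = |retrieved and relevant| / |retrieved|
    recall    = |retrieved and relevant| / |relevant|
    Empty sets give 0.0 to avoid division by zero (a convention chosen here).
    """
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four documents actually answer the question.
relevant = {"d1", "d2", "d3", "d4"}

# A broad query finds every correct answer plus four invalid ones.
broad = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}

# A narrow query quickly returns a single, correct answer.
narrow = {"d1"}

print("broad :", precision_recall(broad, relevant))
print("narrow:", precision_recall(narrow, relevant))
```

The broad query reaches full recall at the cost of precision, and the narrow query does the opposite. Which of the two is preferable depends, as stated above, on the type of problem being solved.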
The second serious problem is tied to the selection of the desired data from the stream of ordinary day-to-day information. A standard example of this type of problem is identifying the average composition of a basket of purchases in a store. Studying its dependence on the time of day and the day of the week enables one to build a more effective logistical schedule. Such studies are known as Market Basket Analysis [4]. When facing such problems, the task of the instructor is to highlight various situations in which similar problems arise. A trainee must understand that solving such problems requires serious practice in creating the necessary software. These strategies are called Data Mining [4]. The methods used in this area are practically the same as those used for detecting new information in various databases. This field of searching technique is generally known as Knowledge Discovery, or Knowledge Discovery in Databases (KDD). Training that creates the necessary skills is a mandatory part of specialized curricula for future programmers and mathematicians. An average educated person needs only a general idea of this subject. Therefore, the main ideas of KDD are usually studied only in introductory lectures, and practice in this field is never included in the actual schedule.
Yet this traditional instructional strategy has to be supplemented with another important topic. Let us return to the example of using Data Mining technology in a big store. Information detected with KDD is of interest not only for logistics. It is also useful to suppliers, trading floor managers, advertising specialists, and even cleaning companies. At the same time, programs for KDD are developed and implemented by experts in programming. Experts in information security are also involved in the process of detecting data. Thus, information that provides a certain income is in fact known to a number of independent institutions. As a result, the relationships associated with KDD processes move into the legal field. Thus, there is a need to include theoretical and practical training in these legal relationships in the schedule. Consequently, universities are faced with the task of training skilled experts in this specific area. Based on their preliminary experience, the authors can say that it is better to arrange such special training during the pre-graduate and post-graduate period.
One has to understand that many problems studied in this way are of more general interest than it seems at first encounter. Let us again return to market basket analysis. One of the typical questions taught during the period of study is: which items are frequently purchased together by customers? It is not difficult to see that this question can be reformulated so that it becomes relevant to organizing the work of libraries. In this case the task is to identify those books that are frequently requested by large groups of readers. The mathematical programs for solving both problems are almost identical. Therefore, the KDD technique and its teaching are based on a Universal Theoretical and Practical Core. This core includes statistics, features of data quality, typical errors, and other topics that are taught in standard courses on experimental technique. The common properties of KDD analysis determine the place of the relevant subjects in modern instructional strategies: learning KDD is connected with serious and long-term co-op training. In contrast, most incorrect and improper data require different approaches for their detection and for protection from their harmful effects. As a consequence, developing the corresponding competencies requires a different instructional strategy.
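The claim that the store problem and the library problem are solved by almost identical programs can be shown with a minimal sketch. It counts how often each pair of items occurs in the same transaction and keeps the pairs whose share of transactions (their support) reaches a threshold. The function `frequent_pairs`, the threshold values, and both data sets are hypothetical. Treating one reader's loans as a "basket" is an assumption made for this illustration, and the sketch is a plain pair count rather than any specific algorithm from [4].

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for item pairs found together often enough.

    support = (number of baskets containing both items) / (number of baskets)
    """
    counts = Counter()
    for basket in baskets:
        # Sort and deduplicate so that each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical store data: one list per customer purchase.
purchases = [
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["milk", "butter"],
    ["bread", "milk", "eggs"],
]

# Hypothetical library data: one list per reader's loans.
loans = [
    ["Algebra", "Statistics"],
    ["Algebra", "Statistics", "Physics"],
    ["Physics"],
]

print(frequent_pairs(purchases, min_support=0.5))
print(frequent_pairs(loans, min_support=0.6))
```

The same function serves both cases without modification. Only the interpretation of "basket" and "item" changes, which is the sense in which the two problems share a common core.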


Fact checking and cultural background

The number of errors, inaccuracies, and deliberate distortions of facts that any person seeking information has to face is very high. Among them, the errors and inaccuracies associated with the lack of professionalism of a resource's authors are random. A large amount of incorrect data also appears because of an inability to organize the search and to quickly recognize suspicious or obviously incorrect information. Specially prepared erroneous data, various fakes, jokes, and provocations are also encountered when searching for information. Preplanned actions aimed at misleading a reader can also be found in everyday practice. However, experience shows that their role in our political life is not very large. We do not know of any data about the preparation of special computer programs and technologies for the mass spreading of incorrect scientific data or advertising information. Therefore, errors and inaccuracies in this segment of the Internet are individual in nature and, at the same time, extremely diverse. As a result, practical strategies for detecting and excluding invalid data of this kind vary widely. According to Polanyi's ideas [1,5], it is possible to say that KDD is based on Explicit Knowledge. In contrast, the detection and suppression of individually produced errors and mistakes must be developed through special practical training, which is necessary for the creation of Tacit Knowledge. Accordingly, the necessary competencies must be developed by solving various practical tasks. The emergence of new knowledge in the trainee's brain is a result of the self-generalization of his (her) personal experience.
Forming competencies based on different types of knowledge requires different instructional strategies. The strategy for dealing with incorrect data of individual origin should be based on verifying the origin, correctness, and reliability of the found data. All found data, ideas, and references are defined as facts. All the technologies and recommendations used in this work are together denoted as Fact Checking. It is a package of practical competencies that forms the universal basic core of a new specialization. The need for educated experts in this highly sought-after professional activity has grown dramatically during the last decade. Therefore, a critical overview of the primary results in this area is urgent.
It is known that Tacit Knowledge requires long and varied practical training to create the necessary competencies. The problems connected with detecting incorrect data depend on the specific field of knowledge and on the traditions of the social group. They also depend heavily on the experience of the person conducting the relevant check. For this reason, training problems are content specific. They depend on national traditions too. Therefore, it does not seem realistic to create a universal set of learning tasks that can be used in all possible cases. While teaching the relevant disciplines, the authors have accumulated a large collection of illustrative examples for practical exercises. All of them have been repeatedly tested in practice. We are going to publish them in a special textbook. These materials are primarily meant for Russian students. Therefore, we give here only a few examples, simplified so as to be interesting for instructors and students from various countries. In these examples we highlight only basic problems. Our choice of the narrated situations was random. For this reason we have excluded personal information about the people related to the materials presented to the reader.
Example one. In the fall of 2016 a very solid Russian-language journal issued in Germany published a discussion between two authors of Russian origin who now live outside Russia. The discussion was conducted in a borderline rude manner. One participant wrote under a pseudonym; the other appeared under his own name. The first discussant accused the second of impropriety and claimed that his information about the thesis he defended while living in the USSR was false, and that the company mentioned as his place of work in the United States did not really exist. The first discussant also gave some information about the dishonesty of the second participant. The second discussant responded sharply. However, he could not refute the charges against him. The editorial position shows that the editorial staff created this mess because they did not do thorough fact checking. Both disputants were also completely helpless in this regard.
At the same time, it is easy to conduct a qualified verification of the information discussed on the website. It is enough to refer to the electronic catalog of the Russian National Library. Entering the name of the second participant as a query, one can see and copy the card of his Ph.D. thesis. It also indicates the number of his publications on the subject: 6 items. If a participant or a member of the editorial board had done this, all the disputed issues would have been resolved instantly. Moreover, if the checker had turned to the catalog of the Russian State Library, the answer would have been more interesting. Finding the card of the Ph.D. thesis there is more difficult. Yet a query with the name of the second participant finds all references to his publications in various Russian sources. None of them is a serious article; they were published in everyday newspapers. These materials support the position of the first participant of the discussion that the second one has been very far from serious science for many years.
A few words about the company of the second participant of the discussion. The company he mentioned is a real institution. Yet its name coincides with the name of another, more serious US company. A simple search by the company's name gives the address of only this larger company, and this is what the first participant of the discussion found. Yet a simple refinement of the search query, such as adding the person's name or the location of the company, immediately gives the necessary address, the occupation of its owner, and the information that this company has only one permanent employee. It means that the first participant is not able to search on the Internet. The second, however, does not know what information about the company should be given or what information can be easily revealed. In this situation we can recollect the words attributed to Sir Thomas More: Many are schooled but few are educated.
Example two. In January 2017 many people received via the Internet two or three texts written by an Israeli journalist and correspondent of a Russian-language news resource in NYC. They were a retelling of material from the 9-TV Israeli TV channel. The clearly provocative material contained a story about the world-famous Ukrainian poet and writer Taras Shevchenko, who, before his death, had secretly emigrated to Palestine via Iran. It was alleged that he had died at the end of his journey, and that nowadays his grave draws eager crowds of strange Ukrainians who roast pigs and drink home-made alcohol on Israeli territory. Despite its obvious absurdity and frankly boorish style, this material created a wave of discussions. The identity of the author of this provocation is not of interest here. We are struck by the helplessness of the head of the TV channel and of the editors of the NYC newspapers, who failed to conduct a basic verification of the reliability of the data they spread. The material includes the text of the Psalms of David in Ukrainian, allegedly translated by Taras Shevchenko during his last trip to Israel. Simply by copying this text into a search query, one can find when it was written and find a book containing this text, which was printed in the city of Vinnitsa about 20 years before the journey allegedly occurred. One can perform a more sophisticated check of at least three points of this material. Each time the result is the same: the author does not have the necessary knowledge. We do not intend to discuss the moral responsibility of the leadership of the mentioned media. Our goal is to show what problems arise when checking the reliability of information and how they are related to the gaps in the cultural background of many authors.
A more complex analysis of several stories that demonstrate the methods of fact checking can be found in [6].


Conclusions

1. A brief analysis of incorrect data shows that it can be divided into two groups. The first is tied to programmed activity on the Internet. To suppress it, it is necessary to train future experts within a content-specific curriculum. This training must create Explicit Knowledge in each trainee.
2. The second group of activities in the detection and suppression of incorrect data is tied to the creation of Tacit Knowledge. Its development is possible only through special practical training. It is necessary to keep content-specific groups of examples to use during the training.
3. All students must have practical training in fact checking in the pre-graduate period. This idea has also been implemented in K-12 education in several countries [7].
4. There are a few legal problems that must be taught to future experts in Knowledge Discovery in Databases, Data Mining, and Fact Checking. There is an urgent need to develop new textbooks and computer-assisted exercises in this field.
5. To judge the quality of various data correctly, each educated person should have a good cultural background.


References

1. Romanenko V., Nikitina G. Developing knowledge: Spiraling ways for individuals and society. American Journal of Science and Technology, vol. 3, no. 6, 2016, pp. 174-189.
2. Toffler A. The Third Wave. 1st ed. New York: Morrow, 1980. 544 pp.
3. Romanenko V., Nikitina G. Theory-oriented curriculum at the tertiary level. Saarbrücken, Deutschland: Lambert Academic Publishing, 2016. 150 pp.
4. Han J., Kamber M. Data Mining: Concepts and Techniques. 3rd ed. Amsterdam: Elsevier, 2012. 744 pp.
5. Polanyi M. Personal Knowledge: Towards a Post-Critical Philosophy. Taylor & Francis e-Library, 2005. [online] [25.08.2016]. Available at: .
6. Romanenko V.N., Nikitina G.V. Information and Teaching (in Russian). St. Petersburg: POLITON, 2017. 84 pp.
7. Grinevich L. Teacher qualification is the biggest challenge for us (in Russian).
(2017) [online][26.01.2017] Available at: — ministr-

