THE RELATION BETWEEN THE DOMAINS OF INFORMATION RETRIEVAL AND KNOWLEDGE ORGANIZATION IN INTERNATIONAL JOURNALS

The objective of this research is to analyze the themes of the international scientific production on Knowledge Organization and Information Retrieval published between 2006 and 2015 and indexed in the Web of Science and Scopus databases. This qualitative and exploratory study examines a corpus of 100 papers that establish a relationship between Knowledge Organization and Information Retrieval. The papers in the corpus are classified into three categories, using the dimensions of Knowledge Organization that are adopted as themes in the Brazilian Congress on Knowledge Organization and Representation organized by the International Society for Knowledge Organization (ISKO): epistemological dimension, applied dimension, and social and political dimension. We found that most of the scientific production is concentrated in the applied dimension, which can be explained by the applied character of Information Retrieval. The epistemological dimension is the second most frequent, and the social and political dimension has a low incidence, which can be explained by the fact that the corpus consists of international scientific production, which is not well reflected in a dimension designed for a Brazilian congress.


Introduction
The scientific production of a domain determines its degree of development and evolution. The analysis of a domain is essential to understand it, identify research trends and contribute to the advancement of production and scientific communication. After all, "it is through knowledge of the scientific and academic productivity, [...], that one may know what is being researched and how the production of these disseminated research may influence the scientific community" (Fujino et al., 2007, pp. 199-200).
It is understood that the publications in the field of Information Science (IS), in this research especially on the theme Knowledge Organization (KO) and Information Retrieval (IR), "reflect the status of this science and make it possible to examine and assess the content produced by scientists, as well as trends, theoretical methods and influences" (Arboit et al., 2010, p. 19).
Araújo, Paula Carina de; Ferneda, Edberto; Guimarães, José Augusto Chaves. The relation between the domains of Information Retrieval and Knowledge Organization in International Journals. // Brazilian Journal of Information Studies: Research Trends. 10:2 (2016) 82-88. ISSN 1981-1640.
KO and IR are key areas in IS studies. The relationship between these two areas is evident, given that information is organized in order to be retrieved by users. Therefore, the objective of this research is to analyze the themes of the international scientific literature on Knowledge Organization and Information Retrieval published between 2006 and 2015 and indexed in the Web of Science and Scopus databases.
This analysis stems from the authors' interest in the scientific production in KO and in articles relating KO to IR. The research question arose while attending the course Computational Models of Information Retrieval in the Information Science Graduate Program at Universidade Estadual Paulista Julio de Mesquita Filho (UNESP).
As Rendón Rojas (2008) stated, analyzing the scientific production of a domain, or the relationship between domains, makes it possible to understand that domain, its construction and its interdisciplinarity. This is another motivating factor that justifies this research.
Following this introduction, the theoretical framework on IR and KO is presented. Then the collected data and their analysis are shown, making it possible to characterize the scientific production on KO and IR. Finally, the conclusions are presented.

Information Retrieval and Knowledge Organization
The term Information Retrieval first appeared in 1951, introduced by Calvin Mooers, an American computer scientist. In this classical definition, the author stated that "information retrieval is the name for the process or method whereby a prospective user of information is able to convert his need for information into an actual list of citations to documents in storage containing information useful to him" (Mooers, 1951, p. 25).
The author also explains that "information retrieval embraces the intellectual aspects of the description of information and its specification for search, and also whatever systems, techniques, or machines that are employed to carry out the operation. Information retrieval is crucial to documentation and organization of knowledge".
From this definition, and especially from the statement that "information retrieval is crucial to documentation and organization of knowledge", one can see that the close relationship between IR and KO, core domains of IS, has been understood since the beginning of the studies in this domain.
IR is studied by both Computer Science (CS) and IS. "In field of information science, information retrieval refers to the interaction of people with information retrieval systems for relevant judgments of the information retrieval results from the selection of the search strategy" (Wilson, 2000 cited by Khapre and Basha, 2012, p. 232). Another characteristic is that "its research focuses on the specific behavior of people search for location information" (Marchionini, 1995 cited by Khapre and Basha, 2012, p. 232). Khapre and Basha (2012, p. 238) carried out a comparative analysis of IR in the fields of IS and CS. The authors showed that the two fields follow different trends: some theoretical developments move toward integration, while others are more static. It is noteworthy that "the similarity of the two disciplines field of information retrieval is more active and cross the field of personalizes, adaptive and implicit evaluation of information use of this techniques has significantly improved the retrieval performance."
It is important to note that Information Retrieval systems, regardless of the area, are implemented to answer users' questions. This is why they exist; it is their basic objective and their main characteristic, considered from their conception (Saracevic et al., 1988). From this observation, it is possible to reflect on the relationship between IR and KO, as proposed in this research.
Guimarães (1990) studied Thematic Information Retrieval, understood as retrieval that allows access to information resources through their subject. In this context, he explains that IR should be understood as a link in the chain of the information flow, which he considers its dynamic aspect. In this way, the author understands IR as a data processing procedure that goes along with document analysis.
From another point of view, "the information retrieval process consists of identifying the set of documents (corpus) of a system, which meets the user's need for information" (Ferneda, 2003, p. 14). When representing the IR process, the author indicates the representation of documents as the second step, which involves the subject analysis of a document and its translation into a linguistic expression; this presupposes a documentary language to ensure standardized indexing and effective IR.
Dahlberg (1993) recognized that, in ancient times, the need to organize knowledge was always closely related to librarians and philosophers. Over the years, other professionals have engaged in this activity. Currently, technology professionals, for example, are very much present and interested in applying KO methodologies. This involvement in the 1990s was due to the accelerated development of information and communication technologies, which began to provide new forms of representation and IR.
When Hjørland and Albrechtsen (1995) claim that it is necessary to incorporate knowledge about the cultures in which information systems function, they are proposing a socio-cognitive approach to KO. Later, in explaining it, Hjørland (2013) states that one cannot create an operational, transferable and standardized definition of a domain that ignores the historical, social and political issues defining the field.
Hjørland is one of the prominent authors on epistemological and socio-cognitive issues of KO. He explains that KO studies the nature and quality of KO processes and KO systems. For the author, "Knowledge Organization is about activities such as document description, indexing and classification performed in libraries, bibliographical databases, archives and other kinds of 'Memory Institutions'" (Hjørland, 2008, p. 86).
Following the socio-cognitive approach, Esteban Navarro and Garcia Marco (1995, p. 149) offered a comprehensive definition which, as stated by Guimarães (2008, p. 86), strengthens the social, materialized and cyclical dimension of knowledge. The authors claim that KO is a discipline devoted to the study and development of the fundamentals and techniques of planning, construction, management, use and evaluation of systems for describing, cataloging, ordering, classifying, storing, communicating and retrieving documents created by men to testify, preserve and transmit their knowledge and their actions, based on their content, in order to ensure their conversion into information capable of generating new knowledge (translated by the author).
Barite (2001, pp. 39-40) discusses access to knowledge and states that KO seeks to provide conceptual content for various practices and social activities related to it. In addition, "it intends to function as an instrument for processing, management and use of information, comprehensive and inclusive of phenomena and applications related to the structure, layout, access and dissemination of socialized knowledge."
Hjørland (2002) explains that the central point in his approach is that instruments, concepts, meanings, information structures, information needs and relevance criteria are established in discourse communities, where the communication process takes place. In this view, the focus of Information Science shifts from individuals and/or computers to the socio-cultural and scientific world.
In treating the organization of documents, Jaenecke (1994, p. 8) states that "its main objective has so far been the ordering and supply of knowledge." The author makes this statement in the article in which he questions the purposes of KO. The provision of knowledge pointed out by Jaenecke (1994) relates to IR, since information is organized in order to be found by those who need it. In line with Guimarães and Sales (2010), it is understood that the objectives of documentary analysis, one of the areas of interest of KO, are directly linked to the representation of document content and to IR.
Smiraglia (2002) clearly establishes the relationship between IR and KO by stating that the latter has been the domain in which storage and retrieval instruments for documentary entities are constructed. He also explains that "catalogs, indexes, and databases have been constructed to allow the rapid manipulation of and retrieval from large collections of surrogate records that represent documents, which in turn represent recorded knowledge" (Smiraglia, 2002, p. 331).
The concepts and theoretical positions presented here reinforce the explicit relationship between IR and KO. The next section presents the methodological trajectory, followed by the results concerning the literature that relates IR and KO.

Methodological Trajectory
In order to address the themes of international scientific production on KO and IR published in the last ten years, a qualitative exploratory study was conducted. Scientific articles published between 2006 and 2015 and indexed in the Web of Science (WoS) database, from Thomson Reuters, and the Scopus database, from Elsevier, were collected. These databases were chosen because they are the most extensive today (Aghaei Chadegani et al., 2013) and, consequently, offer greater visibility; their strict criteria for selecting the journals they index also attest to the quality of the articles.
Data collection was carried out on July 30, 2015. The search was conducted in the title, abstract and keyword fields; the period was limited to 2006-2015; and the search strategy was: "information retrieval" AND ("knowledge organization" OR "information organization"). The search retrieved 57 documents in WoS and 92 in Scopus. The retrieved articles were stored and organized using the reference management software Zotero.
Once the records were organized, Zotero was used to identify repetitions. This first analysis identified 41 repeated articles, that is, articles indexed in both databases, and the repetitions were excluded. Although the investigation was limited to scientific papers, the search also retrieved one paper from proceedings, two book chapters, one interview and one book review, which were excluded as well. In this initial stage, 46 documents were thus excluded from the research corpus.
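The identification of repetitions was done with Zotero; the sketch below only illustrates the general idea of matching records exported from two databases. It is a minimal example, assuming each record is a dictionary with a `title` and an optional `doi` field. The function names, the field names and the sample records (including the DOIs) are hypothetical and are not taken from the study or from Zotero.

```python
import re


def record_key(record: dict) -> str:
    """Build a comparison key: the DOI when present, otherwise a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    # Lowercase the title and collapse punctuation/whitespace so that small
    # formatting differences between databases do not hide a repetition.
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return "title:" + title


def merge_exports(wos: list, scopus: list):
    """Return the unique records and the number of repeated ones."""
    seen = {}
    repeated = 0
    for record in wos + scopus:
        key = record_key(record)
        if key in seen:
            repeated += 1
        else:
            seen[key] = record
    return list(seen.values()), repeated


# Made-up sample records, for illustration only.
wos = [
    {"title": "Facets and Retrieval", "doi": "10.1000/a"},
    {"title": "SKOS in practice", "doi": ""},
]
scopus = [
    {"title": "Facets and retrieval.", "doi": "10.1000/A"},
    {"title": "SKOS in Practice!", "doi": None},
    {"title": "Indexing languages", "doi": "10.1000/b"},
]

unique, repeated = merge_exports(wos, scopus)
print(len(unique), "unique records,", repeated, "repeated")
```

Matching on the DOI first and falling back to a normalized title is a common heuristic; it can still miss repetitions when titles differ substantially between databases, so a manual check remains advisable.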
The second stage of the research consisted of analyzing the content of the articles by reading their titles, abstracts and keywords. The method of organization and analysis was categorization, and the categories used were the three sub-themes, named dimensions of KO, proposed by ISKO in the Brazilian chapter congresses: epistemological dimension, applied dimension, and social and political dimension. At this stage, three other retrieved articles that fell outside the scope were excluded. The final corpus of this research is therefore composed of 100 scientific articles.
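The corpus size can be checked against the figures reported in this section. The short sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Documents retrieved per database on July 30, 2015.
retrieved = {"WoS": 57, "Scopus": 92}
total_retrieved = sum(retrieved.values())

# First stage: repetitions plus documents that are not scientific articles.
duplicates = 41
non_articles = {
    "proceedings paper": 1,
    "book chapter": 2,
    "interview": 1,
    "book review": 1,
}
first_stage_excluded = duplicates + sum(non_articles.values())

# Second stage: articles outside the scope of the research.
out_of_scope = 3

corpus_size = total_retrieved - first_stage_excluded - out_of_scope
print(total_retrieved, first_stage_excluded, corpus_size)
```

The totals agree with the text: 149 documents retrieved, 46 excluded in the first stage and a final corpus of 100 articles.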

International Scientific Production on Knowledge Organization and Information Retrieval
Articles collected from WoS and Scopus on KO and IR were categorized and analyzed because, according to Sales, Guimarães, Oliveira and Bufrem (2011, p. 1), "looking at the intellectual space of a certain area of knowledge in order to understand the movement of the elements that compose it [...] enables greater understanding of the characteristics and behavior of this very area".
We identified 65 scientific journals in which the 100 articles that comprise the corpus of this research were published. Figure 1 shows the 13 journals with two or more articles, considered the most representative. The journals with the most articles are Knowledge Organization, with 11 articles, followed by the Journal of the American Society for Information Science and Technology, with 9 articles.
Analyzing the scope of these two journals, and considering that they are prestigious international publications, one can understand this result. Knowledge Organization is an international journal devoted to theory, classification, indexing and knowledge representation, as noted in the journal description. The Journal of the American Society for Information Science and Technology lists as major areas of research in its scope: knowledge production, knowledge organization, design and evaluation of information systems, access and use of information, and information policy.
Another item analyzed was the type of authorship of the articles that comprise the corpus. Most articles (69) have multiple authors and 31 have a single author, as shown in Figure 2. Co-authored publications thus prevail, and this scenario can be understood from the observation made by Beaver (2004) that co-authorship reflects the generation and exchange of new and current knowledge, through which greater authority is achieved from the epistemological point of view, highlighting the solution of common problems.
The Epistemological Dimension of KO represents the studies on conceptual bases, historical bases, methodological bases and interdisciplinary dialogues of knowledge organization. In this dimension, most studies are on the conceptual bases of knowledge organization, with 18 articles (18%), followed by the interdisciplinary dialogues of KO (5 articles), as shown in Figure 4.
Studies related to this dimension are key to discussing the theory that, consequently, underpins the practice represented in the applied dimension. The predominance of research in the applied dimension can be explained by the characteristics of the corpus, considering that IR is a process involving application, tests, etc. This does not mean that studies in the epistemological dimension are unnecessary, since recognizing and discussing concepts and addressing the history and methods of KO and IR are critical issues. Many studies classified in the applied dimension in this investigation are related to the Simple Knowledge Organization System (SKOS), automatic indexing, the semantic web and ontologies. These themes can be identified as research trends regarding the relationship between IR and KO, since they stood out in the corpus of articles published in the past decade in the international literature.
For further research, we suggest analyzing the social network formed by the authors of this corpus, as well as citation, co-citation and bibliographic coupling analyses, which provide a deeper look into these domains and their relationships. This type of investigation enables the identification of strengths, weaknesses, epistemic communities, research trends and methodological options, as well as the comparison with the theoretical references used.
It should also be highlighted that research related to the social and political dimension needs to be encouraged, since this dimension was poorly represented in this research. This dimension involves important issues such as professional training, ethics, culture, identity and sustainability. In the research conducted by Bufrem (2015), professional training was the most significant issue in the national literature, followed by ethics in KO.
Therefore, as a suggestion for further research, the national journal literature related to the IR and KO domains could also be analyzed. Regarding IR and KO, the analysis of the corpus allowed us to infer that there is a close relationship between these two domains in the international journal literature indexed in the WoS and Scopus databases and published in the last ten years.

Figure 1: Articles per journal
It is also important to highlight the significant presence of Brazilian journals: Perspectivas em Ciência da Informação with four articles, Transinformação with three articles and Informação e Sociedade with two articles. These three journals are some of the most im-

Figure 2: Type of authorship
The 100 articles that make up the corpus of the investigation were classified into three categories, the three dimensions of KO used by ISKO-Brazil as sub-themes of the Brazilian chapter congresses. After reading and analyzing the content of the titles, abstracts and keywords, most of the articles were categorized in the Applied Dimension of Knowledge Organization, represented by 68 articles (68%). We found 27 articles (27%) related to the Epistemological Dimension, and only 5 articles (5%) were categorized in the Social and Political Dimension (Figure 3).
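The distribution reported above can be reproduced with a simple tally. The sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
# Number of articles per dimension of KO, as reported in the text.
counts = {
    "Applied": 68,
    "Epistemological": 27,
    "Social and Political": 5,
}

total = sum(counts.values())

# Share of the corpus per dimension, in percent.
shares = {dimension: 100 * n / total for dimension, n in counts.items()}

# Dimensions ordered from most to least frequent.
ranked = sorted(counts, key=counts.get, reverse=True)

for dimension in ranked:
    print(f"{dimension}: {counts[dimension]} articles ({shares[dimension]:.0f}%)")
```

Because the corpus has exactly 100 articles, each count coincides with its percentage.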

Figure 3: Dimensions of Knowledge Organization

Figure 4: Epistemological Dimension
As stated before, the applied dimension stands out among the other categories. This dimension covers studies of KO models and formats; KO instruments; KO products; and structures in KO. Research related to knowledge organization instruments was the most frequent in the Applied Dimension, with 26 articles (26%). Studies on structures in KO also stood out, with 21 articles (21%) (Figure 5).

Figure 6: Social and Political Dimension
Studies in the Social and Political Dimension of KO are also fundamental. However, the low incidence of articles with this theme can be explained by the fact that the corpus is formed by international articles, whereas the Social and Political Dimension of KO was established as a sub-theme of the ISKO-Brazil congress because of a national yearning present since the first edition of the congress.

Conclusions
These results confirm the predominance of research related to the applied dimension of KO. The same result was obtained by Bufrem (2015) when she analyzed KO thematic representation in the literature of IS journals present in BRAPCI (Reference Database of Articles in Information Science Journals) and published by CNPq (National Council for Scientific and Technological Development) researchers classified as PQ1.