Welcome to Part III of 'Bibliographic network analysis in VOSviewer and CitNetExplorer'. In this session, I'll guide you through how to download and install CitNetExplorer on your computer, how to import the data that you've retrieved from the Web of Science into this tool, and I'll give you a brief overview of what this tool can do. Now I like CitNetExplorer more than VOSviewer. I think the options available within CitNetExplorer are more impressive. It includes the capability of analysing not just the papers that you've retrieved and the citation links between them, but also the citation links between the papers that you've retrieved and all of their cited references. So with this tool, we can try and find literature that is used by the literature that we've found, but that we haven't captured in our query. For citation network analysis, I think it is a better option, but it is slightly more complicated. So I'm going to divide this part into two. The first part will just be a basic introduction. But in the second part, I'm going to show you how to interact with the data that CitNetExplorer will produce for you in order to generate some interesting results. Now for both VOSviewer and CitNetExplorer, I've written a step-by-step guide to analysis in these tools. And this is paired with the 'Introduction to citation network analysis' document that is in the reading list. So I think that these two, the step-by-step guide and the introduction to citation network analysis, will really help clarify how to perform analyses in these particular tools. Okay, let's get started. So first we need to download CitNetExplorer. Here are some download instructions, here on the slide. I'll put the same instructions in the reading list. But the easiest way for me to show you is just, again, to go to my browser. Sorry, my voice is a little croaky today. So CitNetExplorer is a tool developed by CWTS at Leiden University. This is the same group that developed VOSviewer.
But this tool is developed specifically for analyzing citation networks, whereas VOSviewer is a more exploratory tool designed to allow you to perform different bibliographic network analyses very quickly. But it is very limited in what it can actually do for you. CitNetExplorer is different. So I recommend reading through the site if you're going to use this tool; there's a lot of good advice. There are also some warnings here on common problems with importing data into CitNetExplorer. I've covered all of these in the guide that I've produced for you. But we'll just get started here and go straight to download. Again, very similar to VOSviewer, you choose whichever option is best for your system. You can also launch it from the web, which I've never tried before. So as I'm a Windows user, I'm going to download it for my Windows system. Okay, so once you've downloaded the tool, it will appear in a zipped folder. All you need to do is press 'extract all', and then specify where you want this folder to be. I'm just going to save it on my desktop. Again, I don't recommend saving folders on your desktop, but it's fine for what I'm doing now. There we go, and it's open. You might get the same warning message that you get from VOSviewer. I trust these tools because they were produced by a very respected university and group of academics. So if you do get the warning message, I would ignore it. So, to launch CitNetExplorer, what we do is double-click on CitNetExplorer, and it'll open the tool for us. So when you first open the tool, you'll be automatically asked what files you want to analyze. And this works in the same way as VOSviewer's import options. So here I'm going to find the project folder that I've stored my Web of Science data in. And, like VOSviewer, this can read both the plain text and tab-delimited files. And here I'm going to analyze my intranasal oxytocin and schizophrenia literature. So the smaller network for now.
Again, if you wanted to analyze multiple files, CitNetExplorer can merge these for you. But I don't want to do that just now. So, I'm going to press OK. Now below this, CitNetExplorer asks you something that VOSviewer didn't. It asks you whether you want to include non-matching cited references. And what this is asking you is whether you want to include the full bibliographies of all the papers that you've downloaded. So here I have 174 articles and reviews. And within these papers there are probably 10, 15, maybe 20 thousand references. So that's a lot of data to analyze, and I will show you how to do that after this. But for now, for simplicity, I'm going to ignore this. But I will get on to this because I think it's a brilliant option that CitNetExplorer has. So I'll press OK. And now it's loaded my intranasal oxytocin and schizophrenia network into CitNetExplorer. So I'll begin in this corner of the screen. So this little box here tells me that there are 174 publications in my dataset and that these are connected by 1285 citation links. Now I think this is six links fewer than VOSviewer found. And this is explained by the fact that VOSviewer has a slightly more advanced cleaning algorithm behind it. Sometimes I'm a little bit suspicious of the cleaning routines, but for your purposes, it doesn't matter. This is a very small difference. Below this, you have the time period. And this tells me that my 174 articles and reviews were published between 2009 and 2020. Now, in the visualization, you'll notice that there are clearly not 174 publications here. CitNetExplorer, by default, will first visualize the 40 documents most cited by other documents within your network. So these are the 40 papers most cited by these 174 papers. So it is a relative citation count. If you want to change this because you want to see more papers, you can go to 'update publications'.
And you can increase this to about 100 before things get a little bit stressful for CitNetExplorer. There we go. Now there are ways of looking at all of your data, which I will get onto. But for now we'll stick with this. As in VOSviewer, we can zoom in to the network and we can zoom out. If you don't have a scroll wheel on your mouse or keyboard, you have the navigation panel here, with which you can zoom in or zoom out. And you can also move to the right and the left. Similar to VOSviewer, you can also open these in your internet browser. To do this, we right-click. Then we open publications in web browser. What paper is this? This is a nice feature. You can very quickly examine the literature this way. But you'll notice that this network has a very different structure to the way that VOSviewer structures bibliographic data. And this is because CitNetExplorer will position nodes on the y-axis according to their year of publication. So it includes a temporal dimension in citation network visualization. And I like this feature; I think it's a good feature. Now papers are positioned horizontally on the x-axis, in a similar way to how the force-directed algorithm in VOSviewer positions nodes. So it will cluster papers together into areas that are densely interconnected, and it'll push away from each other nodes that are not within parts of the network that cite each other. So it's a very simple visualization. Down the y-axis, we have time. On the x-axis, we have clustering. I prefer CitNetExplorer for lots of reasons. I think one nice feature of it is that, because it has this temporal dimension, it's very easy to see the difference between citations and references. So if you hover over any particular node, you will see all the nodes that it is connected to, within the set of nodes that are loaded into this visualization at the moment. And here we see that Bales references four papers.
Because the links are pointing backwards in time from Bales to these papers, Bales can't be cited by them. So these must be reference links. Whereas all the links that point to Bales from a point in the future must be citations. So it's very easy to see which documents cite Bales and which documents Bales references. This I think is a major advantage of this particular tool. So below the visualization, you have options to alter the layout of your graph. I don't think you need to do anything here just now. You can play around with these options. Now, you can increase the number of publications per layer. And so this will make the years more tightly packed. You will notice that some years are longer than other years. And this is because more papers are published in these years than in 2009. So the y-axis will adjust to accommodate more papers that are published within that year. Now, if you hover over a node, you will see in this box here, in the bottom left-hand side of the screen, information about this paper. So this tells me the last name of the first author, the title of the paper, the journal of its publication, and the year in which it was published. And the cite score here is the number of citations from papers within my set of 174. So it's relative. And it is citations only. It isn't references and citations collapsed together. And this is another reason why CitNetExplorer is much better than VOSviewer. At the bottom here we have some more visualization options. And I think these are the ones you want to use. So we have the ability to change the size of nodes and their labels. You'll notice that the labels are just the last name of the first author. You'll also notice that some of the label names disappear. This is just because CitNetExplorer is trying to maintain the readability of the network. These have labels, but the scale at the moment means that CitNetExplorer is turning them off.
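If it helps to see the difference between references, citations and the cite score outside of the tool, here is a minimal Python sketch. The paper names, years and links are made up for illustration, and this is only my assumption of the bookkeeping involved, not how CitNetExplorer itself is implemented. Each link is stored as a pair (citing paper, cited paper), so references are the links leaving a paper and citations are the links arriving at it.

```python
from collections import Counter

# Hypothetical toy data: each pair is (citing paper, cited paper).
# Names and years are invented for illustration only.
year = {"A": 2009, "B": 2011, "C": 2013, "D": 2015}
links = [("B", "A"), ("C", "A"), ("C", "B"), ("D", "C"), ("D", "A")]


def references_of(paper):
    """Papers that `paper` cites (links pointing back in time from it)."""
    return sorted(cited for citing, cited in links if citing == paper)


def citations_of(paper):
    """Papers that cite `paper` (links arriving from later papers)."""
    return sorted(citing for citing, cited in links if cited == paper)


# Cite score: citations received from papers inside the set only.
# References made by a paper are NOT added to its score.
cite_score = Counter(cited for _, cited in links)

# In this toy set every link points from a later paper to an earlier one.
assert all(year[citing] > year[cited] for citing, cited in links)

print(references_of("C"))  # the papers C references
print(citations_of("C"))   # the papers that cite C
print(cite_score["A"])     # internal citations received by A
```

So for any paper you can ask the two questions separately: who does it reference, and who cites it. The cite score only ever counts the second.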
You can also change the vertical spacing, so that you can make the network more elongated or fatter. You can change the label from the last name of the first author to the last name of the last author. You're almost always going to want to use the last name of the first author. Now, below this, you have the option to turn horizontal lines on or off. You're always going to want to keep these on. These lines are links to papers that are cited or referenced by another paper published in the same year. So, yeah, keep it on. Citation links. You're always going to want to just put show direct. This is just a direct citation link. And so if one paper references another, there'll be a link. And this means that you can interpret the network very easily; it means that here, Bales cites Feifel. There we go. This isn't important for you just now. Actually, I will show you. Now, below this is an option to change what happens when you hover over a node. I'll show you what's happening just now. Now these just show who Bales references and who cites Bales. If I change this to all predecessors and successors, here you see something very different. And this is every paper for which there exists some path, or chain of citations, that you could take to get to Bales. I don't think this is useful, don't do it. So we want to choose direct predecessors and successors. We can also view the data via the publications window. So we were on citation network, and we click on publications. And here you can search for authors, titles of the paper, journal of publication, and the year in which they were published. You can specify a year range to search through. You can also search by a minimum number of citations and also by a range of citations. This group option will make sense later. You don't need to understand that right now. So what data do we have here? We have the authors of a paper in this column. We have the title of the paper in this column.
We have the journal of publication in this column. And then we have the year of publication in this column. We also have a column called cite score, and this is the total number of citations from our 174 documents to this document. So here, Pedersen has been cited 88 times by these 174 articles and reviews. So about half of all papers in this set cite Pedersen's paper. So it's remarkably influential. So this is nice. You can interact with your data. You can see the distribution of citations very easily here. You'll notice it's extremely skewed. This is typical of citation networks. In the step-by-step guide to CitNetExplorer and VOSviewer I talk you through how to analyze this data in more detail. It's very hard to do this in a video. I'll try my best. Now if you want to just view some of these papers, you can tick them. Say, I only want to look at these. I only want to look at any paper cited at least 30 times. Now these will be selected in your Marked publication list. And in the selected publications, you'll notice there are slightly more. And this has to do with this box here. So you have options here to change the way you select papers based on whether they're in your Marked list, based on time period, which we'll go into later, and based on groups, which we'll go into later. So I'll stay here on 'based on Marked publications'. Below this, just keep this as it is. You don't need to worry about it. But below this, you're asked whether you want to add intermediate publications. And this explains the difference between these two values. So we've got 13 papers in the Marked list and 18 in the selected publications. Now the five extra papers are these intermediate publications, and intermediate publications are papers that exist on some path between these papers. So they're not selected. But they're directly linked in some way to some of these papers, and they form an important citation link between them. Now, that might be hard to understand. So I'll show you this.
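One way to picture intermediate publications is with a small Python sketch. This is my own assumption of a reasonable definition (an unmarked paper that is cited, directly or indirectly, by one marked paper and that itself cites, directly or indirectly, another marked paper); it is not taken from CitNetExplorer's code, and the function names and paper names are hypothetical.

```python
from collections import defaultdict


def reachable(start_nodes, adjacency):
    """All nodes that can be reached from any start node via `adjacency`."""
    seen, stack = set(), list(start_nodes)
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def intermediate_publications(links, marked):
    """Unmarked papers lying on some citation path between marked papers.

    `links` holds (citing, cited) pairs; `marked` is the set of ticked papers.
    """
    cites = defaultdict(set)     # citing paper -> papers it references
    cited_by = defaultdict(set)  # cited paper -> papers that cite it
    for citing, cited in links:
        cites[citing].add(cited)
        cited_by[cited].add(citing)
    downstream = reachable(marked, cites)   # cited (in)directly by a marked paper
    upstream = reachable(marked, cited_by)  # citing (in)directly a marked paper
    return (downstream & upstream) - set(marked)


# Hypothetical toy network: M1 and M2 are marked, X sits between them.
toy_links = [("M1", "X"), ("X", "M2"), ("M1", "Y"), ("Z", "M2")]
toy_marked = {"M1", "M2"}
print(intermediate_publications(toy_links, toy_marked))
```

In this toy example only X is intermediate: Y is referenced by a marked paper but leads nowhere, and Z cites a marked paper but is not cited by one.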
Now, if you want to just focus on the papers that are here by these criteria, then you go to 'Drill down'. And this is a very important function in CitNetExplorer. There we go. And now it's kept the papers in my Marked list, well, my selected publications as well. Now this visualization contains 18 papers. And there are also 80 citation links that these papers are responsible for. Now this is not the 13 papers because it includes the intermediary papers I was talking about before. So let's have a look at the ones that we actually chose. So we chose every paper that was cited at least 30 times. You'll notice that the cite score doesn't change because it's calculated from the whole network that was loaded initially. Now when you select a paper, a square will be put around it. Now, if I press 'Drill down' now, I've unticked the intermediary publications. You'll see that if I tick the intermediary publications, they appear in circles. So these are papers that are well connected to the papers that I've selected, but that I've not selected. And it's asking me: do I really want to exclude them? And I do, so I can remove them by unticking this. And then 'Drill down'. Now I've got only the papers that I selected and the citation links between them. So I think this is useful. This allows you to really ask questions about your data. So you can just ask, does this paper cite this paper? And you can do this with a massive dataset very easily. Now if you want to get rid of this 'Drill down', then you press 'Full network', which is here. And this will return you to your original network dataset. So just to show you that again, maybe you're interested in just the relationship between these two papers. You can press 'Drill down'. And you see that there is a citation link between them. Okay. So I think that makes sense. As I say, the manual that I've written for you goes into more detail about all of this. This is just an introductory overview.
You can also flip the network, if you want to flip the network. I don't. So, now here we've got some analysis options. We can look for connected components. And this, you'll remember, is similar to what VOSviewer asks you when you first import your data: it says there are X number of nodes that are connected within a large component, and asks whether you want to keep just this and discard the others, or whether you want to view all the nodes. And this is the same. Here, it is asking us whether we want to identify the largest component. It won't discard any other component here. And it is also asking, if you have more than one component in your dataset, whether you want it to find these components and store them into groups so that you can look at them. So if we had our Greater Prairie Chicken example in here, and we had our intranasal oxytocin and schizophrenia network, and there were no citation links between them, they would constitute two components, and then it would assign them colors. And we could search for those publications within the publications tab. Now, we know there's only one component here, and that it is circled by a few isolates. So if I do this, it tells me a largest connected component consisting of 168 publications has been identified. The publications have been assigned to group one. This isn't surprising. We knew that there were six isolates. Isolates, by definition, can't form a connected component because they're not connected to anything. Ok. Now that's pretty useless because there's only one component here. Now, I want to clear all groups just now, and this will get rid of the coloring and the data assigned to these papers that indicate their group membership. So I'll clear this for the full network. More useful to you will be clustering, most of the time. And this works in the same way as in VOSviewer.
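Going back to connected components for a moment, here is a minimal Python sketch of what finding them involves. It ignores the direction of citation links, which is the usual assumption when looking for components; the function name and the toy papers are hypothetical, and this is an illustration rather than CitNetExplorer's own routine.

```python
def connected_components(nodes, links):
    """Group papers into components, ignoring the direction of citation links."""
    neighbours = {node: set() for node in nodes}
    for citing, cited in links:
        neighbours[citing].add(cited)
        neighbours[cited].add(citing)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = {start}, [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    stack.append(nxt)
        components.append(component)
    # Largest component first, like 'identify the largest component'.
    return sorted(components, key=len, reverse=True)


# Hypothetical toy network: three linked papers and two isolates.
toy_nodes = ["A", "B", "C", "D", "E"]
toy_links = [("B", "A"), ("C", "B")]
components = connected_components(toy_nodes, toy_links)
largest = components[0]
isolates = [c for c in components if len(c) == 1]
print(len(largest), len(isolates))
```

With the real data, the same idea gives one large component of 168 publications and six isolates sitting around it.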
Essentially what you want to do is create clusters out of your network that make sense to you, and that you think are a good fit for the papers' content. It'll work by clustering together papers that are densely packed together via citation links, and it will do this across the whole network dataset. So we have 174 papers here, connected by 1285 citation links. So it won't just cluster the graph you see in front of you. It will be clustering the whole network that sits behind this visualization. Here, I'll just leave the parameters as they are and press OK. So this is telling me that it's detected two clusters, and that, due to the default settings that I haven't changed, 22 publications don't sit within a cluster. This is because, by default, it begins with a minimum cluster size of ten papers. So we know that 22 papers didn't meet the threshold and they've not been assigned to a cluster. Again, you're going to want to play with this so that you form clusters that you think make sense to you. Read the manual for this. The CitNetExplorer manual goes into detail about all of these. It would take a little bit too long here. But I will show you. That's fine. We've got two groups here, so I can show you some of CitNetExplorer's other functions. So you can 'Drill down' based on groups now. And we know that here it says that we've got two clusters, one of size 132, so it has 132 nodes, and one of size 20. Now, if you just want to view just this cluster, you just click here. So in these options you go down to 'based on groups'. You click the cluster that you want to view and then you press 'Drill down'. And because there are only 20 of these, we can see the connections between all of them. I'll do this for the other one just to show you again. Now because of the 100 paper threshold, we will only see the 100 most highly cited papers. There we go. Again, clustering will likely detect papers focusing on very similar research questions.
Also, if you are looking at a network where you have publications over quite a long time period, clustering might group together papers from the same decades, for instance. And this is because the structure of citation has a time dimension to it. Most papers will accumulate most of their citations within the first five years of publication. And after some peak citation, which typically happens between 2 and 4 years after publication, the probability of a paper being cited again falls off exponentially from that peak. So this means that papers that are published in a similar time period tend to cluster together. Now here we have a very short window. We have 11 years. Time is probably not structuring our network in an important way (in terms of clustering). Now what else do we have here? We have shortest and longest path. Now this will make sense to you if you've read the 'Introduction to citation network analysis'. It's not hard to understand. So a shortest path in network analysis is just the shortest route from one node to another. So what if we click on Strauss and Goldman? So here they've got the squares around them. Now I go to shortest path. So this will tell me the smallest number of edges that need to be traversed, if you were to begin at Strauss and follow Strauss' reference list to find another paper that eventually references Goldman. So you can change the colors here. I'll make them yellow. So here, this has told me that I could go from this paper, to this paper, to this paper, to this paper, to this paper to reach Kaufman. Right, so it is a citation path of length 4. So within four links, I can get there. You can also look at the longest path. This is probably less interesting for you. So the longest path is the exact opposite of the shortest path. What is the longest route from Strauss to Goldman?
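For those who want to see what a shortest path search looks like, here is a minimal Python sketch using a breadth-first search that follows reference lists from one paper towards another. The function name and the toy papers are hypothetical, and I am not claiming this is the routine CitNetExplorer uses; it just illustrates the idea of counting the fewest citation links between two papers.

```python
from collections import deque


def shortest_citation_path(links, source, target):
    """Fewest-links path from `source` to `target`, following references.

    `links` holds (citing, cited) pairs. Returns the list of papers on the
    path, or None if `target` cannot be reached from `source`.
    """
    cites = {}
    for citing, cited in links:
        cites.setdefault(citing, []).append(cited)

    previous = {source: None}  # how each visited paper was first reached
    queue = deque([source])
    while queue:
        paper = queue.popleft()
        if paper == target:
            path = []
            while paper is not None:
                path.append(paper)
                paper = previous[paper]
            return path[::-1]
        for cited in cites.get(paper, []):
            if cited not in previous:
                previous[cited] = paper
                queue.append(cited)
    return None


# Hypothetical toy network with a 4-link route and a 3-link route from S to G.
toy_links = [
    ("S", "P1"), ("P1", "P2"), ("P2", "P3"), ("P3", "G"),
    ("S", "Q"), ("Q", "P3"),
]
path = shortest_citation_path(toy_links, "S", "G")
print(path, "length:", len(path) - 1)
```

Note that the length of a path is the number of links, which is one less than the number of papers on it. Searching in the other direction, from G to S, finds nothing, because references only point backwards in time.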
Now paths can be interesting to analyze because they can suggest how a finding is potentially spreading in the literature. If you're going to do this, play around with it, and see if what you're looking at requires this. Okay, so I think really that is all I want to show you in this quick introduction. I will show you, however, this option. So before, we kept the network to just our 174 papers, and just the citation links between those papers. Now I want to construct a network out of the full bibliographies of those 174 papers. So any paper they've ever cited, whether I captured it in my query or not. Now, in the guide, I will tell you that some errors can happen here. So it'll crash CitNetExplorer if you try and load in data that doesn't have a year of publication. Now this can stem from early access publications from your query. And these don't have years of publication. Or, in this window, it can crash CitNetExplorer if references don't have a year of publication contained within them. And this can sometimes be relatively common for 'personal correspondence' references or 'in press' references. So when scientists cite a paper that they know is going to be published, but don't yet know the date on which it will be published. So here I've ticked 'include non-matching cited references'. And now it's asking me for this threshold. Now because of the errors, and because I know these errors exist here in this dataset for very low cited non-retrieved papers, I'm going to create a network with a minimum number of citations to non-retrieved documents. So what this is doing is it's taking the full reference lists of my 174 papers, and then it's counting the citations to those specific references from my set of 174. So what I'm asking here is that I only want to see references that have been cited by at least two of my retrieved papers. My step-by-step guide will go into this in a lot more detail.
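To make that threshold concrete, here is a minimal Python sketch of the counting it implies: take each retrieved paper's reference list, count how many retrieved papers cite each non-retrieved reference, and keep only those cited at least twice. The function name, the `min_citations` parameter and the toy data are hypothetical; this is an illustration of the idea, not CitNetExplorer's import code.

```python
from collections import Counter


def frequently_cited_references(reference_lists, min_citations=2):
    """Non-retrieved references cited by at least `min_citations` retrieved papers.

    `reference_lists` maps each retrieved paper to the references it cites.
    """
    retrieved = set(reference_lists)
    counts = Counter()
    for paper, refs in reference_lists.items():
        for ref in set(refs):  # count each citing paper only once
            if ref not in retrieved:
                counts[ref] += 1
    return {ref: n for ref, n in counts.items() if n >= min_citations}


# Hypothetical toy data: P1-P3 were retrieved, R1-R3 were not.
toy_reference_lists = {
    "P1": ["R1", "R2", "P2"],
    "P2": ["R1", "R3"],
    "P3": ["R1", "R2"],
}
kept = frequently_cited_references(toy_reference_lists, min_citations=2)
print(kept)
```

Here R1 and R2 pass the threshold and R3, cited only once, is left out. Raising the threshold is what keeps the rarely cited, often badly formatted references out of the network.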
So please do read the step-by-step guide. Press OK. There we go. Now we see something different. The number of publications has now increased to 1,974. The number of citation links is now 9,322, and the time period is 1906 to 2020. This is because I have included every reference from the full reference lists that has been cited at least twice. So here we have some papers that we have not retrieved from our search. Even so, you can open some of these in your internet browser despite the fact that you haven't retrieved them formally. This is because a lot of references in Web of Science will have a DOI. Now these papers might be directly linked to your research interests, but they might not be. They might be classic papers that are referenced a lot, but play primarily a kind of performative role in papers that are kind of just summarizing the history of a particular concept or whatever. So this is, I think, very good. You can very quickly see here if you're missing very highly cited papers. Now, the guide goes into this in a lot more detail, and I don't really have time to do that here. So the easiest way to look through these data to see if you're maybe missing something important is to go to your publications window, which is your data again. And you can sort by the number of times cited. Now Pedersen and Feifel, which were the most highly cited before, are still the most highly cited in this network. Now we also see other papers here. We see Kosfeld. We see Bartz, Goldman, Keri, Domes, which have all been cited a lot but were not retrieved. How do I know they've not been retrieved? Well, all non-retrieved papers, so these are just highly cited references that I've missed from my query, will not have a title. So I only have information about the last name of the first author, the journal of publication and the year.
And this is because a formal reference tends to only have the last name of the first author, the journal of publication, the volume, the issue, the page range, and the year of publication. So this is why. It has parsed out that data so that we can look at this. But I think this is useful. You will notice that most of these papers are the papers that we captured from the broader search that looked at intranasal oxytocin, regardless of whether there was a focus on schizophrenia. And, as I said in the previous talks, these are the papers that really made the case that intranasal oxytocin represented a promising treatment for a lot of social and mental disorders. So these are classic papers within the growing field of intranasal oxytocin research. But most of these, at least, are not directly linked to schizophrenia research. Goldman is an exception here, because Goldman is. Now you can go through these very easily and look at whether the papers that are highly cited by the papers that you've captured via your query are actually relevant or not. Ok. So as I say, read the step-by-step guide; I think it's useful. If you do the clustering here, you'll find more clusters this time because it has more papers. So there's more information to know how to cluster your papers. And this is why analyzing a bigger network is better, because it clusters papers together via their shared references as well, not just their direct citation links. So I like this. I like it. Okay. So before I go, I'm going to show you something ridiculous. Just because I like ridiculous, but I think it'll kind of show you something interesting here. So here I'm going to put into CitNetExplorer the Greater Prairie Chicken example, the intranasal oxytocin example, and the schizophrenia and cannabis example. Just to show you what would happen if you put very different literatures into this tool. Now I'm not going to include non-matching cited references here because this would be massive.
There will be lots of references to unique papers because we wouldn't expect papers on the Greater Prairie Chicken to reference the same papers as the intranasal oxytocin schizophrenia papers. So the number of publications that will be brought into this network might be massive. So now, see, now we've got different groups of papers here. Now we can do this again, find connected components. And I'll just use the default options just now. Now remember, here we have the Greater Prairie Chicken literature. It's got a much longer publication history in the papers that I captured from the Web of Science. Here we have the intranasal oxytocin and schizophrenia literature. Here we have the cannabis and schizophrenia literature. And while we can't see a direct citation link between these groups, we will remember from the VOSviewer session that they did form a connected component, but it was through only one or two citation links. This is capturing the same thing. (This is why both are coloured blue.) So that's just to show you what would happen. And I think this is a good demonstration of components for you. Now if you want to take a screenshot, this works in the same way as in VOSviewer. You can save it anywhere you want. And this will take a screenshot of exactly what is within your viewing window. So if you zoom in to this, it'll take that picture. So just be careful. Okay, so I think that is enough. I hope I've put enough in here to capture your interest in CitNetExplorer. In the next session, I'm going to be talking you through how to export the data from both CitNetExplorer and VOSviewer, and how to analyze this in Excel. This is also important to know so that you can import the work that you've been doing in any session back into these tools. And it'll be the final session on these tools, which, if you decide you're suddenly very interested in network analysis, will prepare you nicely for the Gephi optional sessions. Right. I think that's enough.
Well, thank you very much for listening. As always, if you need to get in contact, e-mail me at rleng@exseed.ed.ac.uk. Thank you.