GESIS Guides

A Title for this Guide [please capitalize]

A Subtitle for this Guide

    • Barabási, A.-L. (2012). Luck or reason. Nature, 489(7417), 507–508. https://doi.org/10.1038/nature11486

    • Bourdieu, P. (1990). The Logic of Practice. Polity Press.

    • DiTomaso, N. (1982). “Sociological Reductionism” From Parsons to Althusser: Linking Action and Structure in Social Theory. American Sociological Review, 47(1), 14–28.

    • Gershenson, C., & Heylighen, F. (2003). When Can We Call a System Self-Organizing? In G. Goos, J. Hartmanis, J. van Leeuwen, W. Banzhaf, J. Ziegler, T. Christaller, P. Dittrich, & J. T. Kim (Eds.), Advances in Artificial Life (Vol. 2801, pp. 606–614). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-540-39432-7_65

    • Hofman, J. M., Watts, D. J., Athey, S., Garip, F., Griffiths, T. L., Kleinberg, J., Margetts, H., Mullainathan, S., Salganik, M. J., Vazire, S., Vespignani, A., & Yarkoni, T. (2021). Integrating explanation and prediction in computational social science. Nature, 595(7866), 181–188. https://doi.org/10.1038/s41586-021-03659-0

    • Lindgren, A. (1945). Pippi Långstrump. Rabén & Sjögren.

    • Merton, R. K. (1948). The Self-Fulfilling Prophecy. The Antioch Review, 8(2), 193–210. https://doi.org/10.2307/4609267

    • Soundarajan, S., Tamersoy, A., Khalil, E. B., Eliassi-Rad, T., Chau, D. H., Gallagher, B., & Roundy, K. (2016). Generating Graph Snapshots from Streaming Edge Data. Proceedings of the 25th International Conference Companion on World Wide Web - WWW '16 Companion, 109–110. https://doi.org/10.1145/2872518.2889398

    • Wasserman, S., & Faust, K. (1994). Social Network Analysis: Methods and Applications. Cambridge University Press.

    • Watts, D. J. (2003). Six Degrees: The Science of a Connected Age. Norton.

    • White, H. C. (1992). Identity and Control: A Structural Theory of Social Action. Princeton University Press.

Publication date
Month Day, 202Y; Version X.X
Keywords
keyword, keyword, keyword, keyword, another keyword
DOI
10.60762/ggdbdxxxxx
Suggested citation
Author, N. (202Y). Title of this Guide (GESIS Guides to Digital Behavioral Data, X). Cologne: GESIS – Leibniz Institute for the Social Sciences. doi.org/10.60762/ggdbdxxxxx
License
Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
@misc{https://doi.org/10.60762/ggdbd25001.1.0,
  doi = {10.60762/GGDBD25001.1.0},
  url = {https://www.gesis.org/fileadmin/admin/Dateikatalog/pdf/guides/01_Wagner_Stier_Zens_Digital_Behavioral_Data.pdf},
  author = {Wagner, Claudia and Stier, Sebastian and Zens, Maria},
  keywords = {digital behavioral data, computational social science, digital traces, digital societies, platform data, algorithmic behavior, socio-technical systems},
  language = {en},
  title = {What is Digital Behavioral Data? (GESIS Guides to Digital Behavioral Data, 1)},
  publisher = {GESIS - Leibniz Institute for the Social Sciences},
  year = {2025},
  copyright = {Creative Commons Attribution Non Commercial 4.0 International}
}

Author One1,2, Author Two2

1 Affiliation One

2 Affiliation Two

Abstract

An abstract goes here … Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut.

Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat.

1 First-level heading [headings are not capitalized]

[| Paragraph] This is a standard paragraph.

The standard font for this publication is “Source Sans Pro”. All fonts are embedded in the file. Hyperlinks like www.gesis.org are automatically formatted in the “Hyperlink” format.

Please use American English and APA style for your references; do not use footnotes or endnotes. Please use Oxford commas in lists of three or more items. Please use en dashes – with spacing – for parenthetical insertions. To emphasize text, please use bold [“| Bold” font style format]. Please use typographical quotation marks and apostrophes (“quote with a ‘quote’ within”) everywhere except in code sections; Word generates them automatically when typing. Please note the difference between typographical and non-typographical quotation marks: "non-typographical quote with a 'non-typographical quote' within". The latter can appear when text is pasted into the manuscript.
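Non-typographical quotation marks are easy to overlook after pasting. The following is a minimal sketch of how a manuscript text could be checked for them before submission; it is not part of the GESIS pipeline. The function name find_straight_quotes is hypothetical, and the sketch assumes that the text has been exported as plain text and that code sections have been removed beforehand.

```python
import re

# Straight (non-typographical) double and single quotation marks.
STRAIGHT_QUOTES = re.compile(r"[\"']")


def find_straight_quotes(text):
    """Return (line_number, column, character) for each straight quote.

    Hypothetical helper: line numbers and columns are 1-based.
    """
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in STRAIGHT_QUOTES.finditer(line):
            hits.append((line_number, match.start() + 1, match.group()))
    return hits


if __name__ == "__main__":
    sample = 'A “typographical quote” is fine.\nA "pasted quote" is not.'
    for line_number, column, character in find_straight_quotes(sample):
        print(f"line {line_number}, column {column}: {character}")
```

Each reported position can then be corrected by hand in Word, where retyping the character produces the typographical variant.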

1.1 Second-level heading

Word will automatically keep track of the numbering of headings and subheadings.

Please do not use more than two levels of headings.

Inline heading to structure text. To add a third-level structure to your manuscript, use the “| Heading Inline” style.

Sometimes we need extra space between paragraphs (e.g., to create logical text breaks or to separate boxes/figures/tables/quotes from text paragraphs). To create this extra space, insert a “| Divider” paragraph. The following divider contains text for demonstration purposes; otherwise, a divider must be empty:

[| Divider]

2 Formatting Expert Insights

2.1 Questions and answers

GESIS: What are you working on now, particularly in the context of network analysis and centrality measures?

Author name: At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum.

2.2 Core quotes

Core sentences can be highlighted. Dividers are placed before and after quotes:

This is a quote taken from the text that illustrates a central idea or concept.

Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat.

3 Special features

3.1 Lists and numbered lists

This is a list, using the “| List” style:

  • Apples

  • Oranges

  • Bananas

This is a numbered list, using the “| List Numbered” style:

  1. Apples

  2. Oranges

  3. Bananas

This is a list of checkboxes, using the “| List Checkboxes” style:

  • Apples

  • Oranges

  • Bananas

Hyperlinks like www.gesis.org are automatically formatted in the “Hyperlink” format.

3.3 Citation

Please use APA style. This is a citation of multiple references using the Zotero citation manager. Example citations (Hofman et al., 2021; Lindgren, 1945; Merton, 1948; Soundarajan et al., 2016; White, 1992). It is also possible to cite just a year when the author name Watts (2003) is used in the text. A page can be cited (DiTomaso, 1982, p. 12), also without the name of the author, as in Gershenson and Heylighen (2003, p. 12). Make sure the standard language in Word is English so that “page” is abbreviated as “p.”. It is possible to use a prefix (cf. Barabási, 2012), a suffix (Bourdieu, 1990, and references therein), or both (cf. Wasserman & Faust, 1994, and similar stuff). More citations are added below to test jumping.

3.4 Endnotes

To insert an endnote, (1) use the “Insert endnote” button in the “References” menu and (2) add “[…]” brackets around it. There are endnotes here [1] and here [2].

The endnotes will be in the “Endnotes” section before the reference list.

4.1 Mathematics

The Pythagorean theorem \(a^{2} + b^{2} = c^{2}\) is a fundamental relation in Euclidean geometry between the three sides of a right triangle. Sometimes, one only mentions a single mathematical variable \(a\) in the text. And sometimes a mathematical equation takes its own row:

\[P(k) \sim k^{- \alpha}\]

The paragraph continues here.

4.2 Code

There are two ways to handle code. As a display block like

if True:
    print("Hello")

and as an inline block [| Code Inline] like pip install numpy. This can also be used to highlight package names like numpy.

Note that typographical quotation marks are not possible – and should not be possible – in code blocks.

Snippets of code must use one of the following styles:

  • | Code

  • | Code Inline

  • | Code Python

  • | Code R

The styles that include the name of a programming language will format the code with colors (like in an IDE). The colors aren’t visible in Word, only in HTML when generated from this manuscript. This is Python code:

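# Method of a tree-node class; the method name is a placeholder.
def find_values(self, obj, min_dist, max_dist):
    # Depth-first search over all candidate nodes, starting at this node.
    result, candidates = [], [self]
    while candidates:
        node = candidates.pop()
        distance = node._get_dist(obj)
        # Keep the values of nodes within the requested distance range.
        if min_dist <= distance <= max_dist:
            result.extend(node._values)
        candidates.extend(node._get_child_candidates(distance, min_dist, max_dist))
    return result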

This is R code:

set.seed(42) ## for sake of reproducibility
n <- 6
dat <- data.frame(id=1:n, 
                  date=seq.Date(as.Date("2020-12-26"), as.Date("2020-12-31"), "day"),
                  group=rep(LETTERS[1:2], n/2),
                  age=sample(18:30, n, replace=TRUE),
                  type=factor(paste("type", 1:n)),
                  x=rnorm(n))

We can also create code output. For this, use the “| Code Output” style:

This is code input
This is code output as a block
With two rows

In addition, there is a “| Code Output Inline” format for inline code output, for example, to write that a command creates a MemoryOverflow! error.

4.3 Tables

Small and simple tables. Table 1 is a small (that fits in a single A4 portrait page) and simple (only column labels) table. To build it, (1) create a table in Word, (2) make sure the “Table Style Options”/“Tabellenformatoptionen” have only “Header Row”/“Kopfzeile” enabled, (3) apply the “| Table” or “| Table Small Font” or “| Table Narrow” style, (4) apply the “Guides Table Standard” style in the table layout menu, (5) manually adjust column width if you wish, and (6) change the font style in the first row to “Source Sans Pro”.

Table 1. Example of a standard table.

Name     Age    Discipline          Employee at GESIS
Peter    34     Social science
Mary     45     Computer science
Bob      32     Physics

References to tables are capitalized and in semibold. For example, this is a reference to Table 1.

Contingency tables. Table 2 is a contingency table. To build it, (1) create a table in Word, (2) make sure the “Table Style Options”/“Tabellenformatoptionen” have only “Header Row”/“Kopfzeile” and “First column”/“Erste Spalte” enabled, (3) apply the “| Table” or “| Table Small Font” or “| Table Narrow” style, (4) apply the “Guides Table Contingency” style in the table layout menu, (5) manually adjust column width if you wish, (6) make sure the top-left cell background is white, and (7) make the font in the first column dark blue (top row in the font color from the palette).

Table 2. Example from Word of a 2x2 contingency table.

                       First column category    Second column category
First row category     something here           and here
Second row category    more here                Items here ‒ one ‒ two

Big, sortable, and searchable tables. Table 3 is so big that it could be placed on a landscape page, and it has a very long table caption. The example uses the “| Table Narrow” style. To make such tables sortable and searchable in HTML, apply the “Guides Table Sortable” style in the table layout menu.

Table 3. Dataset overview. (Dataset) Name of the dataset; if no name is available, the model name is used. (Open source) Training data is available (✔) or unavailable (🗶) for researchers; * a subset is publicly available. (Size) Size of the dataset in bytes, or in tokens if the size in bytes is not available. (Cut-off date) The date of the newest information in the training data; * for the latest version of the model family; ** Llama-3 8B has an earlier cut-off date: March 2023. (License) License of the dataset. (Resources) Paper with technical details of the dataset or, if not available, a blog post; * more information is available for the GPT-3 dataset, and GPT-3.5 is assumed to have comparable characteristics. (Construction) Information on the sources of the training data.

Dataset Open source Size Cut-off Date License Resources Construction
GPT-4 dataset 🗶 Dec, 2023* Proprietary No released technical details
GPT-3.5 dataset 🗶 Sep, 2021* Proprietary GPT-3* Paper GPT-3: CommonCrawl, WebText, English Wikipedia, and two books corpora (Books1 and Books2)
Qwen1.5 dataset 🗶 Proprietary
Mistral dataset 🗶 Proprietary
The Pile 825 GB Dec, 2020 MIT License Paper Datasheet: https://arxiv.org/abs/2201.07311
Llama-3 dataset 🗶 15T tokens Dec, 2023** Proprietary Blog post extension of Llama-2 dataset, 5% non-English with 30 languages in total
Llama-2 dataset 🗶 Sep, 2022 Proprietary Paper CommonCrawl, C4, GitHub, Wikipedia, Books, ArXiv, StackExchange
Falcon RefinedWeb (✔)* 2.8TB Feb, 2023 ODC-By 1.0 Paper CommonCrawl + curated corpora
Phi-2 dataset 🗶 250B tokens Proprietary Paper combination of NLP synthetic data created by AOAI GPT-3.5 and filtered web data from Falcon RefinedWeb and SlimPajama, which was assessed by AOAI GPT-4

Simple tables and contingency tables are not sortable and searchable in HTML by default.

4.4 Figures, graphics, images

Insert figures. The preferred formats for inserted figures are SVG, PNG, and JPEG. All figures must have a caption. Captions must have the style “Figure Caption”. Figure 1 is an example of an SVG figure and Figure 2 is an example of a PNG figure.

For setting the width of a figure in HTML, there are three styles. Figures styled with “| Figure 100” will be as wide as the HTML page, figures styled with “| Figure 50” will be half as wide as the HTML page, and figures styled with “| Figure 33” will be a third as wide as the HTML page. The following is a “| Figure 100”:

Five plots.

Figure 1. Figure 100% wide

The following is a “| Figure 50”:

Diagram showing feedback loop relationship of pattern and transactions.

Figure 2. Figure 50% wide

And we repeat the same figure as a “| Figure 33”:

Diagram showing feedback loop relationship of pattern and transactions.

Figure 3. Figure 33% wide

Here in Word, adjust figure size by hand. As a rule of thumb, fonts in figures should be as large as in the text.

Word art. You can craft figures using Word art. The editors will help with styling.

4.5 Boxes

To highlight summaries, lessons learned, or something along those lines, use boxes. To create a box, (1) choose one of the colored boxes below and copy it where you want it, (2) the paragraph containing the box must be a standard “| Paragraph”, and (3) make sure the text inside the box (not the box itself) is formatted using one of the “| Box Blue/ Pink/ Bubble/ Turquoise” styles. The box is as wide as the page and automatically adjusts its height.

Blue Box

Text here

Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.

Pink Box

Text here

Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.

Berry Box

Text here

Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.

Turquoise Box

Text here

Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat.

5 File metadata

In order for the pipeline to work as intended, metadata must be set for the DOCX file. Navigate to File > Info > Properties > Advanced properties (you must click on the pull-down menu of properties to get to the advanced properties). In the window that opens, click on the right folder where you can set parameter values. Set the guide title as the value of the title parameter, and set the DOI (without the “http://doi.org/” part) as the value of the doi parameter. If the guide is an Expert Insights interview, change the value of the DocumentType parameter to “interview” (without quotation marks).
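Before submitting, it can help to verify that these parameters are actually stored in the file. The following is a minimal sketch, not the pipeline's own check. It assumes that the parameters set in the advanced properties dialog are saved as custom document properties in the docProps/custom.xml part of the DOCX package; the function names read_custom_properties and check_guide_metadata are hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace of the custom-properties part in Office Open XML packages.
CUSTOM_NS = "{http://schemas.openxmlformats.org/officeDocument/2006/custom-properties}"
CUSTOM_PART = "docProps/custom.xml"


def read_custom_properties(docx_file):
    """Return the custom document properties of a DOCX file as a dict.

    Assumption: the parameters live in docProps/custom.xml.
    """
    with zipfile.ZipFile(docx_file) as archive:
        if CUSTOM_PART not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read(CUSTOM_PART))
    properties = {}
    for prop in root.findall(f"{CUSTOM_NS}property"):
        # Each property holds one typed child element (e.g., vt:lpwstr).
        value = prop[0].text if len(prop) else None
        properties[prop.get("name")] = value
    return properties


def check_guide_metadata(properties):
    """Return a list of problems found in the title and doi parameters."""
    problems = []
    for name in ("title", "doi"):
        if not properties.get(name):
            problems.append(f"missing parameter: {name}")
    doi = properties.get("doi") or ""
    if "doi.org" in doi:
        problems.append("doi must not include the doi.org prefix")
    return problems
```

A call such as check_guide_metadata(read_custom_properties("manuscript.docx")), where the file name is a placeholder, returns an empty list if both parameters are set and the DOI is given without the resolver prefix.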

Acknowledgements [optional]

Notes [optional]

Put additional information here. For example: All links in the text and the reference list were retrieved on Month DD, 202Y.

Version note [optional]

Put information on the version of the guide here.

Declaration of AI [optional]

Lorem ipsum.

About the author / About the authors

Author A is a sociologist. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.

Author B is a computer scientist. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.


  1. First endnote

  2. Second endnote