by Dr Michael Wettengel
Archivist, German Federal Archives (Bundesarchiv)
Abstract

After German unification, many former East German government agencies and
institutions were closed down.
Archivists had to secure not only their paper records, but also a
considerable number of machine-readable data holdings. Very often, however, the documentation of
these electronic records proved to be incomplete or even totally missing. In those cases, like the "Kaderdatenspeicher",
the database of files on party functionaries, different approaches were taken
to identify and verify data file structures and to reconstruct missing
documentation.

1. Introduction

The process that
led to German unification was rapid and spectacular. As nobody could foresee the dynamics of change and the sudden
collapse of the former German Democratic Republic (GDR, East Germany) that
caused the unification of the two German states, procedures to handle the
various problems of this period of transition had to be improvised. German
archivists were confronted with a situation without precedent as well. After 45 years of separation and different
institutional traditions, the former East German Central State Archives were
merged with the West German Federal Archives in October 1990. At
the same time as this reorganization took place, archivists had to face
considerable challenges. When,
suddenly, former East German government agencies and institutions were closed
down, not only their paper records, but also a considerable number of machine-readable
data holdings had to be secured or rescued from possible destruction. Whereas paper records were treated with
professional routine, concepts and procedures for the acquisition, appraisal,
description, and management of machine-readable records were lacking. The
new situation helped to bring about a change in German archivists’ attitudes
towards electronic records. Whereas
previously, little attention was paid to machine-readable material, the need to
take care of large quantities of East German data files revealed the necessity of
a stronger commitment in that field. The
Federal Archives decided to establish a section for machine-readable archives,
which became responsible for electronic records from former East German central
agencies and institutions as well as from federal government offices. Furthermore, this section was charged with
advising these federal offices on information management issues. The section was set up in August 1991 but
not provided with staff and basic technical equipment until summer 1993. By then, much precious time had already been
lost. The experiences with securing East German data files
showed that the creating organizations were not the best custodians of
machine-readable archives. Many data
files were no longer legible, and in most cases the data documentation was
incomplete or missing altogether. Federal offices
only cared for these electronic records in so far as they could use them for
their purposes. However, these
experiences also showed that in a world where state and society are in constant
transition, it makes sense to have archivists engaged in electronic records
management and taking records of permanent value into their custody.

2. Conditions of acquisition

In
the former GDR, machine-readable data holdings had been processed by centralized
mainframe systems in big data processing centres that belonged to the State and
received their commissions from government agencies and party
institutions. In most cases, they were
even institutionally affiliated with one or another of these agencies. Data processing centres throughout the East
German territory performed tasks and carried out orders from central government
agencies. Office
automation systems had been unknown in East Germany, and the first
applications for PCs with relatively small hard disks were not introduced in
East German government offices until the second half of the 1980s, shortly
before the collapse of the East German state.
Generally speaking, the GDR had yet to begin the introduction of
decentralized desktop personal computers and local server networks. With
the coming of formal unification in October 1990, East German state agencies
and institutions that were not taken over by federal offices or one of the
newly established federal states ("Länder") were either privatised or
dissolved. The same happened to many
data processing centres throughout the territory of the former GDR. Therefore, archivists who tried to take over
electronic records were confronted with varying situations, depending on what
happened to the respective data centres after unification. Archivists
had the easiest time working with data processing centres still in operation
and now operated by a federal government agency or one of the Länder. In such instances, sufficiently documented
data holdings could be acquired, and it was easy to obtain information from
operators and programmers. Very
often, however, data processing centres were in operation for only a short time
before they were closed. In these
cases, a process of decay in operation and organization was already underway
while the various centres were still in existence. Specialists from these centres tried to find new jobs elsewhere
and took with them both knowledge and the relevant manuals and data
documentation, which they regarded as their personal property. Typically, only the data carriers were left
to the archivists. The
situation was better in those cases where the data processing centre was closed
down immediately and the doors were locked.
Archivists had to enter sealed rooms, where they were confronted by huge
piles of paper records, printouts, manuals, card indices, floppy disks, tapes,
hard disk plates, and punch cards. But
as data processing centres in the former GDR were required to create and
maintain sufficient documentation on every project in at least three different
copies, there was a good chance of finding enough context information along with the
data files. The
situation was much worse in those data processing centres that had been
privatised after unification. These newly
established private companies considered data holdings, which had been
processed for government agencies before 1990, to be part of their business
capital. They did not refrain from
selling former East German government data files. Even in cases where a company acknowledged that these data
files were now federal property, it nevertheless charged a tremendous fee for
the alleged preservation of the data. As
can be seen from these different examples, much depended on whether there was a
federal or state agency that took care of East German data files. In the case of the statistical data holdings
of the former GDR, these records have been secured by the Federal Office for Statistics (Statistisches
Bundesamt). The
former East German Central State Administration for Statistics (Staatliche
Zentralverwaltung für Statistik), which created these records, became a
branch of the Federal Office for Statistics, whereas the former Data Processing
Centre for Statistics (Datenverarbeitungszentrum Statistik) continued
operation until the end of 1992 under the Common Office for Statistics of the
New Länder (Gemeinsames Statistisches Amt der neuen Bundesländer). By the end of 1991, the Federal Archives and
the Federal Office for Statistics agreed on a formal co-operation in order to
secure East German statistical data files.
Even
if conditions for acquisition were good, as in the example of the statistical
records, this did not mean that archivists could easily take over the
files. For instance, legal
obstacles had to be overcome. The
Commissioner charged with the oversight and implementation of German privacy
legislation (Bundesbeauftragter für den Datenschutz) demanded that all
personal identifiers in East German data files should be deleted. In addition, the Federal Office for
Statistics claimed that statistical secrecy prevented the transfer of
statistical data files with single items of data to the Federal Archives. Despite these various problems, the Federal Archives
have been successful in acquiring East German data holdings without alterations
of the data in most cases.
Machine-readable records in the fields of statistics, economy,
agriculture, education, penal registration, and labour have been taken
over. The Federal Archives Law, which
was amended in 1990, provided the legal claim to take over East German
records. With the help of staff of the
former East German Central Archives in the newly established GDR-division of
the Federal Archives, appraisals and acquisitions of these records began. The GDR archivists provided much necessary
information for the description of the data files, information that proved to
be very important if the original documentation was missing.

3. Media, record structures and codes

Data
processing systems in the territory of the former GDR proved to be not entirely
different from those in the Western world.
East German computer centres possessed mainframe systems for the
processing of large data compilations, as was common in Western countries about
twenty years earlier. East German data
holdings usually had hierarchical file structures that were not very
complicated. The hardware and software
used by East German data processing centres were copies of and variations on
Western models, naturally with different names. For instance, the so-called ESER-mainframe systems in East
Germany were copies of IBM-mainframes.
These facts, of course, greatly facilitated the work of archivists. The
primary storage medium was 9-track tape. Many of the tapes had a density of only 800 bpi. Owing to production problems, these tapes
bearing the East German trade marks ORWO or PYRAL proved to be in very poor
condition. Glue and abrasion had to be
removed from the tapes before they could be read. Sometimes, layers of the tape separated after the first reading
because of insufficient binder. In
order to secure the data, the tapes had to be copied as soon as possible. Although blocks or even whole tapes could
often no longer be read physically, there generally existed at least one backup
copy. Therefore, data losses could be
compensated for in many cases. Magnetic
hard disk plates had also been used as a storage medium. As a result of their uneven surface, those
plates sometimes damaged the reading heads.
Programs and job files were usually stored on tapes, on punch cards and
on 5.25-inch or 8-inch floppy disks. The
physical state of the data files depended on when the information was stored on
the tapes and on the storage conditions in the stack area. If these conditions were inappropriate, up
to 40% of the tapes could no longer be physically read after five years. The labelling of the tapes
followed the IBM scheme, with hardly any variation. As in Western IBM mainframe applications, EBCDIC was used as
the character code. The Russian code DKOI (in the
former GDR also called “ESER Code”), whose name translates as Binary Code for
Information Interchange, could also be found in East German data files. DKOI is very similar to EBCDIC and is
basically an extension of EBCDIC with a few variations and some extra
symbols:

Hexadecimal   DKOI   EBCDIC
4A            [      ‘
4F            !      |
5A            ]      !
5B            O      $
6A            |      (none)
A1            ¾      (none)
C0            {      (none)
D0            }      (none)
E0            \      (none)

Binary-coded numerical
values, often used alternately with fields in EBCDIC representation, have been
typical features of East German data files, too. The frequent use of data compression techniques posed a
particular problem for archivists. The
record length was generally variable - a characteristic common to many Western
IBM-mainframe applications, as well.
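
The relationship between EBCDIC and DKOI listed above can be illustrated with a short sketch. It is not the procedure the Federal Archives used; it only shows the idea of decoding with a base EBCDIC table and then overriding the positions where DKOI differs. The names `DKOI_OVERRIDES` and `decode_dkoi` are hypothetical, Python's `cp037` codec is assumed as a stand-in for the base EBCDIC code page, only a subset of the listed positions is included, and the Cyrillic part of DKOI is not modelled.

```python
# Positions where DKOI differs from EBCDIC, copied from the list in the text.
# Only a subset is included; this is an illustration, not a full DKOI table.
DKOI_OVERRIDES = {
    0x4A: "[",
    0x4F: "!",
    0x5A: "]",
    0x6A: "|",
    0xC0: "{",
    0xD0: "}",
    0xE0: "\\",
}

# Assumption: "cp037" stands in for the base EBCDIC code page.
BASE_CODEC = "cp037"


def decode_dkoi(raw: bytes) -> str:
    """Decode bytes as EBCDIC, replacing the positions overridden by DKOI."""
    chars = []
    for byte in raw:
        if byte in DKOI_OVERRIDES:
            chars.append(DKOI_OVERRIDES[byte])
        else:
            chars.append(bytes([byte]).decode(BASE_CODEC))
    return "".join(chars)
```

With the made-up input `bytes([0xC1, 0x4A, 0xF1, 0x5A])`, the function returns `A[1]`: the letter and the digit come from the base table, the brackets from the overrides.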
However, the data fields in East German records were usually not
separated by delimiters. East German holdings had
been collected and processed for very specific purposes in the fields of
statistics, social and economic policy planning, personnel management,
distribution of goods, labour employment, and workforce distribution. Large data collections of statistical files,
goods and production files, and personnel files had been processed with the
help of Assembler or PL/1 programs, which are highly dependent on the mainframe
environment of the data processing centres.
Due to their sequential, hierarchical file structures, these
machine-readable records were archived as “flat files”, that is to say, as mere
sequential bit strings.

4. Reconstructing documentation: The Kaderdatenspeicher

In order to understand the
content of East German data files, it was essential to obtain complete
documentation. Archivists were not only
looking for program and data file documentation in a limited sense, but also
for the relevant context information on the “history” and the various purposes
of the data file. As a minimum,
the Federal Archives required the data file structure, the
number of data sets, the data values, complete codebooks, compression
algorithms, and a list to identify the content of each tape. In spite of this general rule, it was decided
in rare instances to take over data files because of their informational value,
although not even this basic information could be obtained. One of these data files, the
so-called database of party functionaries or “Kaderdatenspeicher”, may
serve as an example. The Kaderdatenspeicher
contains personal data on 331,980 staff members (in 1989) of all former East
German government agencies, excluding those of the Ministry of State Security (Ministerium
für Staatssicherheit), the Ministry for National Defence (Ministerium
für Nationale Verteidigung), and the Ministry of the Interior (Ministerium
des Innern). These files not only
provide insight into the political and professional career of officials, but
also contain information on their parents.
There were several copies of
the Kaderdatenspeicher, of which the only one that still exists is the
one acquired by the Federal Archives.
In at least one case, there is sufficient evidence that one copy of the Kaderdatenspeicher
had been deliberately deleted shortly before the German unification in order to
protect cadre members. The considerable
value of this holding provided an incentive for the Federal Archives to invest
quite heavily in the reconstruction of its documentation. After first copying the
tapes of the Kaderdatenspeicher, the volume labels, the headers, and the
first blocks of data of each file were printed out. The volume labels and headers followed the IBM scheme, so they were
easy to interpret. From these data,
information on the content of each tape and an initial idea of the different
generations and applications of the Kaderdatenspeicher could be
obtained. However,
one typical problem already became apparent at this early stage: in the few lines of the volume label and headers, three different ways were used to express the date. Of
course, there are many more possibilities for expressing dates, especially considering the different ways of “packing” dates and numbers. There
is, for instance, a very common method of storing any date from the 20th century
in only two bytes: nine
bits for the day of the year (0 to 511) and seven bits for the number of
years since 1900 (0 to 127).
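
The two-byte packing just described can be made concrete with a minimal sketch. Because the order of the two fields is not fixed, the sketch takes it as a parameter. The function name `unpack_date` is hypothetical, and two points are assumptions rather than facts from the source: the two bytes are read most significant byte first, and the day count starts at 1 for 1 January.

```python
from datetime import date, timedelta


def unpack_date(two_bytes: bytes, years_first: bool = True) -> date:
    """Unpack a date stored as 7 bits of years since 1900 and 9 bits of day-of-year."""
    if len(two_bytes) != 2:
        raise ValueError("expected exactly two bytes")
    # Assumption: the two bytes form one 16-bit value, most significant byte first.
    value = int.from_bytes(two_bytes, "big")
    if years_first:
        years = value >> 9            # upper 7 bits
        day_of_year = value & 0x1FF   # lower 9 bits
    else:
        day_of_year = value >> 7      # upper 9 bits
        years = value & 0x7F          # lower 7 bits
    # Assumption: day 1 is 1 January; a stored 0 is treated as invalid here.
    if day_of_year == 0:
        raise ValueError("day of year is 0")
    return date(1900 + years, 1, 1) + timedelta(days=day_of_year - 1)
```

Trying both values of `years_first` on a field known to contain a date, and checking which result is plausible, is one simple way to narrow down which layout a file uses.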
This way of expressing the date again leaves two options, starting
either with the days or years.

            Byte 1      Byte 2
either      yyyy yyyy   dddd dddd
or          dddd dddd   yyyy yyyy

There
is also the possibility of expressing a date as a count that is incremented for
every day (or some other unit of time) since a system-dependent fixed date. These so-called “timer ticks” are extremely
difficult to decipher if the fixed date is not known. In East German data files, many different possibilities were used
to express dates or numbers. The
data sets of the Kaderdatenspeicher showed that only the full name, the Personal Identification Number (Personenkennziffer or PKZ), the address, and the agency were
written in plain EBCDIC. The
Personal Identification Number was
a unique number given to every citizen of the former GDR at birth. By this number, every East German citizen
could be identified. East Germans
carried this number with them in all official records throughout different life
situations, be it professional career or imprisonment. This Personal Identification Number was
also the key to a flourishing exchange of personal data between different East
German data processing centres, uninhibited by privacy legislation. The fields of the number are:

ddmmyy = Date of birth: two-digit numbers for day, month, and year.
s = Sex and century of birth: “2” = male born before 1900, “3” = female born before 1900, “4” = male born after 1900, “5” = female born after 1900.
cccc = Location code: for individuals born before 1970, the place of residence; for individuals born after 1970, the birthplace.
x = System control digit.

All
the other data fields were coded by numerical values, represented as binary
figures. The record length of the Kaderdatenspeicher
is variable. Binary codes and packing methods had been quite
common in East German data files, and the methods used often varied. Fortunately, no further compression
algorithms had been used in the case of the Kaderdatenspeicher. The Kaderdatenspeicher
had been processed with the help of Assembler programs. It became clear that
without a precise description of the data file structure, there was no way to understand
the meaning of the data. Therefore,
as much information on the Kaderdatenspeicher
as possible was needed. The orders and commissions to create and process the Kaderdatenspeicher came from the Council of
Ministers (Ministerrat der DDR).
The vertical files of this office had been added to the collections of the Federal
Archives in Potsdam after unification. After searching these holdings for
references to the Kaderdatenspeicher-project, a
series of records that contains descriptions of the Kaderdatenspeicher
and reports from the data processing centre with a lot of
substantial information could be found. These
paper records provided information on the content, purpose, history, and
development of the Kaderdatenspeicher project. The
reports to the Council of Ministers also
contained information on the data file structure and codebooks. The Kaderdatenspeicher
consists of annual compilations, so-called “generations” of data files for the
year 1980 and for each year from 1985 to 1989 as well as of extracts for
various purposes. Almost all of these
data files have at least a slightly different structure. Nevertheless, the data file structures of
all generations of the Kaderdatenspeicher
could be found. Much information could
be inferred from so-called “address tables” (Adressentabellen), which
represent the record layout of a specific file. In some instances, the
content of data fields could also be inferred from the forms used for the
collection of the data, of which specimens were found in the records. Of course, comparing the items on the
forms with the content of the data fields was only possible if the data
items were not expressed in binary figures. The
data flow between East German data processing centres mentioned above proved to
be another source of information in the effort to reconstruct lost
documentation. This exchange of large
quantities of coded data could only operate on the basis of shared
codebooks. In fact, the codes used in
the large East German personal data holdings were relatively stable
and were often the same. Diagrams could
be found in the records, where the codes of different data holdings were
compared. What was meant to be a tool
to facilitate data exchange is now a guide for archivists to find out which
codes of data fields in different data holdings are the same. The
data files of the Kaderdatenspeicher were
closely linked with the so-called staff databases of ministries and separate
government branches (Arbeitskräftedatenspeicher),
which contained
personal data on staff members. All the
data of the Kaderdatenspeicher were
originally collected from these staff databases. The Federal Archives has been successful in acquiring
relatively comprehensive and complete documentation of the staff database of
the Ministry of Public Education (Ministerium für Volksbildung). Therefore, additional information on the
record layout and the data file structure of the Kaderdatenspeicher could be derived from the
documentation of the staff database of this ministry. However,
many questions remained open. Even if
the structure of a record and the address, length, and content of a
specific field are known, the field may still not be understood. To take the simple example from above, there
are many ways to express a date, and the one used may not be known. In these cases, specific software is used to
analyse sequential files. In
order to obtain background information, archivists have also made contacts with
former employees of East German data processing centres who had created or
worked with the data holdings that were acquired. In rare and difficult cases, for instance, when compression
algorithms were used which could not be deciphered, programmers from former
East German data processing centres were even hired as consultants.

5. Access for researchers

As
has been pointed out, different approaches had to be taken in order to identify
and verify data file structures and to reconstruct documentation. In
this way, much of the missing documentation of East German data holdings could
be reconstructed. However, although a
number of fairly well-documented data files can already be presented for
research purposes, most East German data holdings still remain a problem
because of the specific hardware environment in which they were created. Since the main goal of reconstructing
documentation is to facilitate access to the data, additional efforts are
necessary. For long-term
preservation, East German data files are stored as flat files. Apart from this “archival copy”, the Federal
Archives are planning to create “research copies” with specific formats that
are better suited for research purposes and easier to handle. These “research copies” are not meant for
archival preservation. The Federal
Archives has made an agreement with the Centre for
Historical Social Research (Zentrum für Historische Sozialforschung)
and the Data Archives for
Social Research (Zentralarchiv für empirische Sozialforschung) in
order to use the expertise and the technical facilities of these institutions
to create research files of East German machine-readable records. The aim of this co-operation is to promote
historical social research on the former GDR.
Taking
over East German data holdings has certainly been an extreme experience from
which it is difficult to generalise.
However, some of the attitudes and procedures in East German computer
centres are probably universal. For
instance, it seems that people working with computers love to play around with
programs and data but are not particularly fond of documenting what they are
doing. A lot of what is important for
future archivists and researchers of data holdings will always be in private
notebooks or in the memories of system administrators and records
creators. However, preserving these
archival holdings means ensuring their accessibility in the future, and
reconstructing documentation may be one of the keys to it.

Paper first
published: History and Electronic Artefacts, Edward Higgs (Ed.), Oxford University Press, 1998, pp. 265-276, and originally
presented to the Annual Meeting of the Society of American Archivists,
Washington, D.C., Sept. 2, 1995.

For further reading on German archives and records management, see Development and Traditions of Records Management and Archives in Germany, by Dr Nils Brübach, and Disposition and Archiving of Authentic Electronic Records in the new Germany's "Information Network Berlin-Bonn", by Drs Andreas Engel and Michael Wettengel.

Example
Structure of the Personal Identification Number
“ddmmyy s cccc x”
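
The structure of the Personal Identification Number shown in the example (“ddmmyy s cccc x”) can be illustrated with a small parsing sketch. The names `Pkz` and `parse_pkz` are hypothetical, and the sample number used below is made up. Two points are assumptions: "born before 1900" is mapped to the 1800s and "born after 1900" to the 1900s, and the control digit is only extracted, not verified, since the checking rule is not given in the text.

```python
from dataclasses import dataclass
from datetime import date

# Meaning of the "s" digit as given in the text. Mapping it to a century
# base of 1800 or 1900 is an assumption of this sketch.
SEX_AND_CENTURY = {
    "2": ("male", 1800),
    "3": ("female", 1800),
    "4": ("male", 1900),
    "5": ("female", 1900),
}


@dataclass
class Pkz:
    birth_date: date
    sex: str
    location_code: str
    control_digit: str


def parse_pkz(text: str) -> Pkz:
    """Split a number of the form 'ddmmyy s cccc x' into its fields."""
    digits = text.replace(" ", "")
    if len(digits) != 12 or not digits.isdigit():
        raise ValueError("expected 12 digits in the form ddmmyy s cccc x")
    if digits[6] not in SEX_AND_CENTURY:
        raise ValueError(f"unknown sex/century digit: {digits[6]}")
    sex, century = SEX_AND_CENTURY[digits[6]]
    day, month, year = int(digits[0:2]), int(digits[2:4]), int(digits[4:6])
    return Pkz(
        birth_date=date(century + year, month, day),  # raises ValueError if invalid
        sex=sex,
        location_code=digits[7:11],
        control_digit=digits[11],  # extracted only; the check rule is not known here
    )
```

For the made-up input `"010150 4 1234 5"`, the sketch yields a birth date of 1 January 1950, sex `male`, location code `1234`, and control digit `5`.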
Michael Wettengel (born in
1957) is an historian. Since 1989, he
has worked as an archivist at the Federal
Archives (Bundesarchiv), Germany.
In 1991, he became the head of the newly established machine-readable
archives section of the Federal Archives, and since 1999 he has been
responsible for government records and electronic records management. He also gives professional training courses
for archivists on electronic records at the Federal Archives and at the Archives Institute (Archivschule)
in Marburg.
He is a member of the board of Quantum (Association for Quantification
and Methods in Historical and Social Research). He participates in projects, committees and working groups
concerning electronic records management and information technology in public administration,
such as the pilot project on Document Management and Electronic Archiving (DOMEA) in the Federal Ministry of the Interior (Bundesministerium
des Innern, BMI). He is a
member of the Committee
on Electronic and Other Current Records of the International Council on
Archives (ICA), of the DLM (Données lisibles par machine - Electronic
Records) Monitoring Committee of the European Commission in Brussels, and of
Sub-committee 11 “Records Management” of ISO/TC 46. He chairs the sub-committee NABD/AA 15 on records management of
the German Institute for Standardisation (DIN).