sesam: Ensuring Privacy for a Interdisciplinary Longitudinal Study
At the "Workshop Elektronische Datentreuhänderschaft - Anwendungen, Verfahren, Grundlagen", held as part of Informatik 2006 in Dresden, Boris Glavic and Klaus Dittrich, who are responsible for the sesam database, presented their paper on 5 October 2006 under the title:
sesam: Ensuring Privacy for a Interdisciplinary Longitudinal Study
Abstract: Most medical, biological and social studies face the problem of storing information about subjects for research purposes without violating the subject’s privacy. In most cases it is not possible to remove all information that could be linked to a subject, because some of this information is needed for the research itself. This fact holds especially for longitudinal studies, which collect data about a subject at different times and places. Longitudinal studies need to link different data about a specific subject, collected at different times for research and administration use. In this paper we present the security concept proposed for sesam, a longitudinal interdisciplinary study that analyses the social, biological and psychological risk factors for the development of psychological diseases. Our security concept is based on pseudonymisation, encrypted data transfer and an electronic data custodianship. This paper is mainly a case study and some of the security problems emerged in the context of sesam may not occur in other studies. Nevertheless we believe that an adopted version of our approach could be used in other application scenarios as well.
The full article is available as a .pdf at its original location, or alternatively here.
A few relevant quotes from the seven-page article:
Because of the need to link subjects and scientific data even anonymisation without quality reduction is not applicable for sesam. Considering these constraints, protecting the subject’s privacy is limited to pseudonymisation of scientific data and protecting the data and mapping between subjects and data from unauthorised access.
(...)
We use pseudonyms called subject identifiers or SIDs to identify the subjects about which scientific data was collected. All personal information like name or address is stored associated with another pseudonym called subject study number or SSN. The mapping between SID and SSN is not stored in sesamDB. We establish an electronic data custodian to control the access to the mapping between the SSNs and the SIDs.
The mapping information is stored in a second database located at an external location and administrated by an external organisation. This external database, called mapDB, is connected to sesamDB via a private connection. sesam-employees have no direct access to mapDB and can only access the mapping information using a sesam client application. These client applications authenticate users and restrict the access to the mapping information to specific use cases.
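To make the separation of the two pseudonyms concrete, here is a minimal sketch in Python. It is not the sesam implementation: the class names `MapDB` and `SesamDB`, the `enrol` helper, the use-case names and the SID format are all assumptions made for illustration. The only ideas taken from the quotes are that personal data is keyed by SSN, scientific data by SID, and the SID-to-SSN mapping lives with a separate custodian that only answers for specific use cases.

```python
import secrets


class MapDB:
    """Hypothetical stand-in for the external custodian database (mapDB)."""

    # Use cases allowed to resolve a mapping; these names are assumptions.
    ALLOWED_USE_CASES = {"schedule_followup", "withdraw_consent"}

    def __init__(self):
        self._sid_to_ssn = {}

    def register(self, ssn):
        """Create a fresh random SID for an SSN and store the link."""
        sid = "SID-" + secrets.token_hex(8)
        self._sid_to_ssn[sid] = ssn
        return sid

    def resolve(self, sid, use_case):
        """Return the SSN for a SID, but only for a permitted use case."""
        if use_case not in self.ALLOWED_USE_CASES:
            raise PermissionError(f"use case {use_case!r} may not resolve mappings")
        return self._sid_to_ssn[sid]


class SesamDB:
    """Hypothetical stand-in for sesamDB; it never stores the SID-SSN link."""

    def __init__(self):
        self.personal = {}    # SSN -> personal information
        self.scientific = {}  # SID -> list of measurements


def enrol(sesam_db, map_db, ssn, name, address):
    """Store personal data under the SSN and open a scientific record under a new SID."""
    sesam_db.personal[ssn] = {"name": name, "address": address}
    sid = map_db.register(ssn)
    sesam_db.scientific[sid] = []
    return sid
```

In this sketch, someone who only sees `SesamDB` finds personal records and scientific records but no way to join them; the join requires a call to `MapDB.resolve` with an accepted use case.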
(...)
sesamDB will be backed up to a second server on a daily basis. This second server is placed in the same location with the sesamDB server. In addition tape backups will be performed every week and the tapes will be stored in a secured location outside the central site.
(...)
Access to data stored in the sesamDB is restricted to computers located at the sesam central site. These computers are connected to sesamDB via a local network connection. We require that no computer that is connected to sesamDB is connected to the Internet. The access to sesamDB is restricted to specialised client applications, which have only access to the data needed for their field of activity. For example the client application used for data export and scientific analysis has access rights for all scientific data, but no access rights for personal subject information and mapDB.
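The per-application access rights described here can be pictured as a simple lookup table. The following Python sketch is an illustration only: the client names, the category names and the `read` helper are hypothetical, and the entry for `admin_client` is an assumption. Only the rule for the export client (scientific data yes, personal data and mapDB no) comes from the quote.

```python
# Data categories each client application may read.
# Only the export client's rights are taken from the paper's example;
# the admin client entry is an assumption for illustration.
CLIENT_RIGHTS = {
    "export_client": {"scientific"},
    "admin_client": {"personal", "mapping"},
}


def check_access(client, category):
    """Return True if the client application may read the data category."""
    return category in CLIENT_RIGHTS.get(client, set())


def read(client, category, store):
    """Read a data category on behalf of a client, or refuse."""
    if not check_access(client, category):
        raise PermissionError(f"{client} has no access to {category} data")
    return store[category]
```

An unknown client gets an empty set of rights, so the default is to deny access.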
The client application used for data export logs all data export queries and stores the log information in sesamDB. The log information allows us to monitor the data exports and analyse the exports executed by a specific person. Data is made available to third parties in aggregated form, without SIDs and with assent of the study direction.
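As a rough illustration of logged, aggregated exports, consider the Python sketch below. It is an assumption-laden toy: the function names, the log fields and the choice of the mean as the aggregate are hypothetical, and the log is kept in an in-memory list, whereas the paper stores it in sesamDB.

```python
from datetime import datetime, timezone
from statistics import mean

# In-memory stand-in for the export log (the paper keeps this in sesamDB).
export_log = []


def export_aggregate(user, records, field):
    """Log the export, then return an aggregate that contains no SIDs."""
    values = [record[field] for record in records]
    export_log.append({
        "user": user,
        "field": field,
        "rows": len(values),
        "time": datetime.now(timezone.utc).isoformat(),
    })
    return {"field": field, "n": len(values), "mean": mean(values)}


def exports_by(user):
    """Return all logged exports executed by one person."""
    return [entry for entry in export_log if entry["user"] == user]
```

The returned dictionary carries only the field name, the number of records and the mean, so the SIDs present in the input records never leave the function.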
patpatpat - 28. Okt, 12:46