Peter Buneman

University of Edinburgh

Author Statistics

Papers

Citation

H-Index

i-10 index

Research Trends

Author Order

Document Type

Co-Authors

Susan B. Davidson

University of Pennsylvania

Wenfei Fan

University of Edinburgh

Atsushi Ohori

Tohoku University

Wang-Chiew Tan

Alpha Omega Alpha Medical Honor Society

James Cheney

University of Edinburgh

Scott Weinstein

California University of Pennsylvania

Dan Suciu

University of Washington

Val Tannen

University of Pennsylvania

Sanjeev Khanna

University of Pennsylvania

Val Breazu-Tannen

University of Pennsylvania

Cooperative Institutions

Penn Center for AIDS Research

211

Association for Computing Machinery

195

University of Utah

University of Illinois Urbana-Champaign

University of Edinburgh

University of Waterloo

Huawei Technologies (United States)

University of Pennsylvania

Dartmouth College

Institut national de recherche en informatique et en automatique

Research Field

Keys for XML

Peter Buneman Susan B. Davidson Wenfei Fan Carmem S. Hara Wang-Chiew Tan

WWW '01: Proceedings of the 10th International Conference on World Wide Web (May 2001), pages 201–210. Published online 1 April 2001. Author affiliations: Peter Buneman, Susan Davidson, and Wang-Chiew Tan (University of Pennsylvania); Wenfei Fan (Temple University); Carmem Hara (Universidade Federal do Parana, Brazil).

10.1145/371920.371984

Citations (345)

A Provenance Model for Manually Curated Data

Lecture notes in computer science (2006)

Peter Buneman Adriane Chapman James Cheney Stijn Vansummeren

Copying

Data model (GIS)

10.1007/11890850_17

Citations (45)

Workshop on Database Programming Languages

François Bancilhon Peter Buneman

Citations (48)

RDF graph alignment with bisimulation

Proceedings of the VLDB Endowment (2016)

Peter Buneman Sławek Staworko

We investigate the problem of aligning two RDF databases, an essential problem in understanding the evolution of ontologies. Our approaches address three fundamental challenges: 1) the use of "blank" (null) names, 2) ontology changes in which different names are used to identify the same entity, and 3) small changes in the data values as well as small changes in the graph structure of the RDF database. We propose approaches inspired by the classical notion of graph bisimulation and extend them to capture the natural metrics of edit distance on the data values and the graph structure. We evaluate our methods on three evolving curated data sets. Overall, our results show that the proposed methods perform well and are scalable.
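The classical bisimulation idea that the abstract builds on can be illustrated with a short partition-refinement sketch. This is not the authors' algorithm: it omits the edit-distance extensions entirely, and the function name `bisimulation_classes`, the `_:` prefix convention for blank nodes, and the toy triples are assumptions made for illustration. Nodes that end up in the same class are candidates for alignment.

```python
from collections import defaultdict


def bisimulation_classes(triples):
    """Group the nodes of an edge-labelled graph into bisimulation classes.

    triples: iterable of (subject, predicate, object) strings.
    Assumption: names starting with "_:" are blank nodes; every other
    name (URI or literal) is initially distinguishable by its own name.
    Returns a dict mapping each node to a class id.
    """
    out = defaultdict(set)
    nodes = set()
    for s, p, o in triples:
        out[s].add((p, o))
        nodes.update((s, o))
    nodes = sorted(nodes)

    # Initial partition: all blank nodes share one colour.
    colour = {n: ("blank" if n.startswith("_:") else n) for n in nodes}
    while True:
        # A node's signature is its current colour plus the set of
        # (edge label, colour of target) pairs on its outgoing edges.
        sig = {
            n: (colour[n], frozenset((p, colour[o]) for p, o in out[n]))
            for n in nodes
        }
        ids = {}
        refined = {n: ids.setdefault(sig[n], len(ids)) for n in nodes}
        # Refinement only ever splits classes, so an unchanged class
        # count means the partition is stable.
        if len(ids) == len(set(colour.values())):
            return refined
        colour = refined


# Hypothetical data: two blank nodes describing the same value, one not.
classes = bisimulation_classes([
    ("_:a", "name", "Alice"),
    ("_:b", "name", "Alice"),
    ("_:c", "name", "Bob"),
])
```

To compare two versions of a database under this sketch, one would load the triples of both versions into a single graph with disjoint blank-node names and read off which nodes share a class.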

Bisimulation

RDF Schema

Blank

Linked Data

10.14778/2994509.2994531

Citations (17)

Archiving scientific data

ACM Transactions on Database Systems (2004)

Peter Buneman Sanjeev Khanna Keishi Tajima Wang-Chiew Tan

Archiving is important for scientific data, where it is necessary to record all past versions of a database in order to verify findings based upon a specific version. Much scientific data is held in a hierarchical format and has a key structure that provides a canonical identification for each element of the hierarchy. In this article, we exploit these properties to develop an archiving technique that is both efficient in its use of space and preserves the continuity of elements through versions of the database, something that is not provided by traditional minimum-edit-distance diff approaches. The approach also uses timestamps. All versions of the data are merged into one hierarchy where an element appearing in multiple versions is stored only once along with a timestamp. By identifying the semantic continuity of elements and merging them into one data structure, our technique is capable of providing meaningful change descriptions, and the archive allows us to easily answer certain temporal queries such as retrieval of any specific version from the archive and finding the history of an element. This is in contrast with approaches that store a sequence of deltas where such operations may require undoing a large number of changes or significant reasoning with the deltas. A suite of experiments also demonstrates that our archive does not incur any significant space overhead when contrasted with diff approaches. Another useful property of our approach is that we use XML format to represent hierarchical data and the resulting archive is also in XML. Hence, XML tools can be directly applied on our archive. In particular, we apply an XML compressor on our archive, and our experiments show that our compressed archive outperforms compressed diff-based repositories in space efficiency. We also show how we can extend our archiving tool to an external memory archiver for higher scalability and describe various index structures that can further improve the efficiency of some temporal queries on our archive.
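The merge-by-key idea in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the published tool: versions are modelled as nested Python dicts whose dictionary keys play the role of the key structure, timestamps are stored as plain sets of version numbers rather than compact intervals, and the names `new_node`, `merge`, `retrieve`, `history`, and the sample records are all hypothetical.

```python
def new_node():
    """An archive element: when it existed, its children, its leaf values."""
    return {"versions": set(), "children": {}, "values": {}}


def merge(node, version, data):
    """Merge one version of the data into the archive rooted at `node`."""
    node["versions"].add(version)
    if isinstance(data, dict):
        # Elements are matched by key, so an element that persists
        # across versions is stored once with a growing timestamp set.
        for key, child in data.items():
            merge(node["children"].setdefault(key, new_node()), version, child)
    else:
        # Leaf content (assumed hashable) is stamped with the versions
        # in which it held.
        node["values"].setdefault(data, set()).add(version)


def retrieve(node, version):
    """Rebuild the snapshot of `node` as it was in `version`."""
    for value, versions in node["values"].items():
        if version in versions:
            return value
    return {
        key: retrieve(child, version)
        for key, child in node["children"].items()
        if version in child["versions"]
    }


def history(node, path):
    """Return the sorted versions in which the element at `path` existed."""
    for key in path:
        node = node["children"][key]
    return sorted(node["versions"])


# Hypothetical two-version database.
v1 = {"rec1": {"name": "alpha", "site": "A"}}
v2 = {"rec1": {"name": "alpha", "site": "B"}, "rec2": {"name": "beta"}}

archive = new_node()
merge(archive, 1, v1)
merge(archive, 2, v2)
```

With this structure, retrieving a version or the history of an element is a single traversal of the archive rather than a replay of deltas, which is the contrast the abstract draws with diff-based storage.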

10.1145/974750.974752

Citations (136)

A type system that reconciles classes and extents

Database Programming Languages (1992)

Peter Buneman Atsushi Ohori

We present a type system that naturally couples two different, and apparently contradictory, notions of inheritance that occur in object-oriented databases. To do this we distinguish between the type and a kind of a value: A type describes the entire structure of a value, while a kind describes only the availability of certain fields or methods. This distinction allows us to manipulate heterogeneous collections (collections of values with differing types) in a statically type-checked language. Moreover, the type system is polymorphic and types may be inferred using an extension of the technique used in ML. This means that it is easy to express general-purpose operations for the manipulation of heterogeneous collections. We believe that this system not only provides a natural approach to static type-checking in object-oriented databases; it also offers a technique for dealing with external databases in a statically typed language.

Inheritance

Type safety

Data type

Abstract data type

Value (mathematics)

Citations (12)

Mediator languages—a proposal for a standard

ACM SIGMOD Record (1997)

Peter Buneman Louiqa Raschid Jeffrey D. Ullman

The DARPA Intelligent Integration of Information (I³) effort is based on the assumption that systems can easily exchange data. However, as a consequence of the rapid development of research, and prototype implementations, in this area, the initial outcome of this program appears to have been to produce a new set of systems. While they can perform certain advanced information integration tasks, they cannot easily communicate with each other. With a view to understanding and solving this problem, there was a group discussion at the DARPA Intelligent Integration of Information/Persistent Object Bases (I³/POB) meeting in San Diego, in January, 1996; and a further workshop was held on this topic at the University of Maryland in April, 1996. The list of participants is in Appendix A. The idea emerging from these meetings was not to force all systems to communicate according to specified standards, but to agree on the following:

• A minimal core language, or Level 1 option, which would be a restriction of the object-oriented query language OQL, such that it will accept queries for relational databases. We recommend that all system components be able, at a minimum, to accept queries in this syntax, provided they address concepts (e.g., relations or classes, attributes or instance variables) known to that component. There must be a simple protocol to determine the schema of a system (its set of supported concepts).

• A simple format for representing answers. This could also be a fragment of OQL and will be included in the core language specification.

• A set of extensions, one of which could be full OQL, and would handle complex structures and abstract types (with methods). Other extensions will be needed to support rules (e.g., definitions of terms that can be shared among components), semistructured data (for self-describing objects), and shared code. A system component could support one or more of these extensions, independently, and there should be some simple protocol to determine the particular extensions that are supported.

Schema (genetic algorithms)

Implementation

10.1145/248603.248611

Citations (50)

Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data : SIGMOD'93, Washington, DC, May 26-28, 1993

Association for Computing Machinery eBooks (1993)

Peter Buneman Sushil Jajodia

Citations (11)

RemIX

William Waites James Sweet Roger Baig Peter Buneman Marwan Fayed

The concept of the Internet Exchange Point (IXP), an Ethernet fabric central to the structure of the global Internet, is largely absent from the development of community-driven collaborative network infrastructure. The reasons for this are two-fold. IXPs exist in central, typically urban, environments where strong network infrastructure ensures high levels of connectivity. Between rural and remote regions, where networks are separated by distance and terrain, no such infrastructure exists. In this paper we present RemIX, a distributed IXP architecture designed for the community network environment. We examine this praxis using an implementation in Scotland, with suggestions for future development and research.

10.1145/2940157.2940162

Citations (7)

Annotation algebras for RDFS Data.

Peter Buneman Egor V. Kostylev

Citations (3)