Graduation Project (Thesis) Foreign-Language Translation

The Concepts and Design of Distributed DBMS

1. INTRODUCTION

A major motivation behind the development of database systems is the desire to integrate the operational data of an organization and to provide controlled access to the data. Although integration and controlled access may imply centralization, this is not the intention. In fact, the development of computer networks promotes a decentralized mode of work. This decentralized approach mirrors the organizational structure of many companies, which are logically distributed into divisions, departments, projects, and so on, and physically distributed into offices, plants, and factories, where each unit maintains its own operational data. The shareability of the data and the efficiency of data access should be improved by the development of a distributed database system that reflects this organizational structure, makes the data in all units accessible, and stores data close to the location where it is most frequently used.

Distributed DBMSs should help resolve the "islands of information" problem. Databases are sometimes regarded as electronic islands: distinct and generally inaccessible places, like remote islands. This may be a result of geographical separation, incompatible computer architectures, incompatible communication protocols, and so on. Integrating the databases into a logical whole may prevent this way of thinking.

2. Concepts

To start the discussion of distributed DBMSs, we first give a definition of a distributed database.

Distributed database: a logically interrelated collection of shared data physically distributed over a computer network.

Following on from this, we have the definition of a distributed DBMS.

Distributed DBMS: the software system that permits the management of the distributed database and makes the distribution transparent to users.

A distributed database management system (DDBMS) consists of a single logical database that is split into a number of fragments.
Each fragment is stored on one or more computers under the control of a separate DBMS, with the computers connected by a communications network. Each site is capable of independently processing user requests that require access to local data and is also capable of processing data stored on other computers in the network.

Users access the distributed database via applications. Applications are classified as those that do not require data from other sites (local applications) and those that do require data from other sites (global applications). We require a DDBMS to have at least one global application. A DDBMS therefore has the following characteristics:

- A collection of logically related shared data;
- The data is split into a number of fragments;
- Fragments may be replicated;
- Fragments/replicas are allocated to sites;
- The sites are linked by a communications network;
- The data at each site is under the control of a DBMS;
- The DBMS at each site can handle local applications autonomously;
- Each DBMS participates in at least one global application.

From the definition of the DDBMS, the system is expected to make the distribution transparent to the user. Thus, the fact that a distributed database is split into fragments that can be stored on different computers, and perhaps replicated, should be hidden from the user. The objective of transparency is to make the distributed system appear like a centralized system. This is sometimes referred to as the fundamental principle of distributed DBMSs.

Advantages and Disadvantages of DDBMSs

The distribution of data and applications has potential advantages over traditional centralized database systems. Unfortunately, there are also disadvantages. In this section, we review the advantages and disadvantages of the DDBMS.

Advantages

Reflects organizational structure

Many organizations are naturally distributed over several locations. For example, DreamHome has many offices in different cities.
It is natural for databases used in such an application to be distributed over these locations. DreamHome may keep a database at each branch office containing details of such things as the staff who work at that location, the properties that are for rent, and the clients who own or wish to rent out these properties. The staff at a branch office will make local inquiries of the database. The company headquarters may wish to make global inquiries involving access to data at all or a number of branches.

Improved shareability and local autonomy

The geographical distribution of an organization can be reflected in the distribution of the data; users at one site can access data stored at other sites. Data can be placed at the site close to the users who normally use that data. In this way, users have local control of the data, and they can consequently establish and enforce local policies regarding the use of this data. A global database administrator (DBA) is responsible for the entire system. Generally, part of this responsibility is devolved to the local level, so that the local DBA can manage the local DBMS.

Improved availability

In a centralized DBMS, a computer failure terminates the operations of the DBMS. However, a failure at one site of a DDBMS, or a failure of a communication link making some sites inaccessible, does not make the entire system inoperable. Distributed DBMSs are designed to continue to function despite such failures. If a single node fails, the system may be able to reroute the failed node's requests to another site.

Improved reliability

As data may be replicated so that it exists at more than one site, the failure of a node or a communication link does not necessarily make the data inaccessible.

Improved performance

As the data is located near the site of "greatest demand", and given the inherent parallelism of distributed DBMSs, speed of database access may be better than that achievable from a remote centralized database.
Furthermore, since each site handles only a part of the entire database, there may not be the same contention for CPU and I/O services that characterizes a centralized DBMS.

Economics

In the 1960s, computing power was calculated according to the square of the cost of the equipment: three times the cost would provide nine times the power. This was known as Grosch's Law. However, it is now generally accepted that it costs much less to create a system of smaller computers with the equivalent power of a single large computer. This makes it more cost-effective for corporate divisions and departments to obtain separate computers. It is also much more cost-effective to add workstations to a network than to upgrade a mainframe system.

The second potential cost saving occurs where databases are geographically remote and the applications require access to distributed data. In such cases, owing to the relative expense of transmitting data across the network as opposed to the cost of local access, it may be much more economical to partition the application and perform the processing locally at each site.

Modular growth

In a distributed environment, it is much easier to handle expansion. New sites can be added to the network without affecting the operations of other sites. This flexibility allows an organization to expand relatively easily. Increasing database size can usually be handled by adding processing and storage power to the network. In a centralized DBMS, growth may entail changes to both hardware and software.

Disadvantages

Complexity

A distributed DBMS that hides the distributed nature from the user and provides an acceptable level of performance, reliability, and availability is inherently more complex than a centralized DBMS. The fact that data can be replicated also adds an extra level of complexity to the distributed DBMS. If the software does not handle data replication adequately, there will be degradation in availability, reliability, and performance compared with the centralized system, and the advantages we cited above will become disadvantages.
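The trade-off between replication and complexity can be made concrete with a small sketch. It is only an illustration under stated assumptions: the class names `Replica` and `ReplicatedFragment`, the site names, and the "write only when every replica is reachable" policy are all hypothetical choices made for this example, not the behavior of any particular DDBMS. Reads survive the failure of one site, which is the availability benefit; writes show the extra care that replication demands.

```python
class Replica:
    """One copy of a fragment held at a site; `up` models whether the site is reachable."""

    def __init__(self, site):
        self.site = site
        self.up = True
        self.data = {}


class ReplicatedFragment:
    """A fragment stored at several sites (hypothetical helper for illustration)."""

    def __init__(self, replicas):
        self.replicas = replicas

    def read(self, key):
        # Availability: any reachable replica can answer the read.
        for replica in self.replicas:
            if replica.up:
                return replica.data.get(key)
        raise RuntimeError("no replica reachable")

    def write(self, key, value):
        # Assumed policy: reject the write unless every replica is reachable,
        # so that the copies can never diverge.
        if not all(replica.up for replica in self.replicas):
            raise RuntimeError("a replica is unreachable; write rejected")
        for replica in self.replicas:
            replica.data[key] = value


frag = ReplicatedFragment([Replica("site_a"), Replica("site_b")])
frag.write("P1", "flat for rent")

frag.replicas[0].up = False            # site_a fails
value_after_failure = frag.read("P1")  # still answered by site_b

try:
    frag.write("P2", "house for rent")
    write_accepted = True
except RuntimeError:
    write_accepted = False             # the write is refused while site_a is down
```

Under this policy a single failed site blocks all updates, so a careless design turns the availability advantage into a disadvantage, which is the point the paragraph above makes. Real systems use more elaborate replication protocols to avoid this.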
Cost

Increased complexity means that we can expect the procurement and maintenance costs for a DDBMS to be higher than those for a centralized DBMS. Furthermore, a distributed DBMS requires additional hardware to establish a network between sites. There are ongoing communication costs incurred with the use of this network. There are also additional labor costs to manage and maintain the local DBMSs and the underlying network.

Security

In a centralized system, access to the data can be easily controlled. However, in a distributed DBMS not only does access to replicated data have to be controlled in multiple locations, but the network itself has to be made secure. In the past, networks were regarded as an insecure communication medium. Although this is still partially true, significant developments have been made to make networks more secure.

Integrity control more difficult

Database integrity refers to the validity and consistency of stored data. Integrity is usually expressed in terms of constraints, which are consistency rules that the database is not permitted to violate. Enforcing integrity constraints generally requires access to a large amount of data that defines the constraint but which is not involved in the actual update operation itself. In a distributed DBMS, the communication and processing costs that are required to enforce integrity constraints may be prohibitive. We return to this problem in Section

Lack of standards

Although distributed DBMSs depend on effective communication, we are only now starting to see the appearance of standard communication and data access protocols. This lack of standards has significantly limited the potential of distributed DBMSs. There are also no tools or methodologies to help users convert a centralized DBMS into a distributed DBMS.

Lack of experience

General-purpose distributed DBMSs have not been widely accepted, although many of the protocols and problems are well understood.
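Consequently, we do not yet have the same level of experience in industry as we have with centralized DBMSs. For a prospective adopter of this technology, this may be a significant deterrent.

Database design more complex

Besides the normal difficulties of designing a centralized database, the design of a distributed database has to take account of fragmentation of data, allocation of fragments to specific sites, and data replication.

Concepts and Design of Distributed DBMSs

1. Introduction

A major factor driving the development of database systems is the desire to integrate an organization's operational data and to provide controlled access to it. Although integration and controlled access suggest centralized management, that is not the aim. In fact, the development of computer networks has encouraged a decentralized way of working. This decentralized approach reflects the organizational structure of many companies: logically they are divided into divisions, departments, projects, and so on, and physically into offices, plants, and factories, with each unit maintaining its own operational data. Better data sharing and more efficient data access depend on the development of distributed database systems, which reflect this organizational structure, make the data of every unit accessible, and store data close to where it is used most often.

Distributed DBMSs help to solve the problem of information islands. Databases are sometimes seen as isolated, inaccessible electronic islands, much like remote islands. This may be caused by geographical separation, incompatible computer architectures, or incompatible communication protocols. Integrating the databases into one logical whole may change this way of thinking.

2. Concepts

Before discussing distributed DBMSs, we first define a distributed database.

Distributed database: a collection of logically related shared data that is physically distributed over a computer network.

From this we obtain the definition of a distributed DBMS.

Distributed DBMS: the software system that manages the distributed database and makes the distribution transparent to users.

A distributed database management system consists of a single logical database divided into a number of fragments. Each fragment, under the control of a separate DBMS, may be stored on one or more computers interconnected by a communications network. Each site can independently process user requests for local data and can also process data stored on other computers in the network.

Users access the distributed database through applications. Applications are divided into those that do not need data from other sites and those that do. A DDBMS is generally required to include at least one global application. A DDBMS should therefore have the following characteristics:

- A collection of logically related shared data.
- The data is fragmented.
- Fragments may be replicated.
- Fragments/replicas are allocated to sites.
- The sites are connected by a communications network.
- The data at each site is controlled by a DBMS.
- The DBMS at each site can handle local applications autonomously.
- Each DBMS participates in at least one global application.

As the definition of a DDBMS shows, the system is expected to make the distribution transparent to the user. Users therefore do not need to know that the distributed database is fragmented, stored on several different computers, and possibly replicated. The aim of transparency is to let users work with the distributed system as if it were a centralized one. This is often called the fundamental principle of DDBMSs. This requirement provides powerful functionality for the end user.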
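The characteristics listed above (fragments, replicas allocated to sites, local and global applications, and transparency) can be sketched in a few lines of Python. This is a minimal illustration, not a real DDBMS: the `Site` and `GlobalCatalog` classes, the fragment names, and the site names London and Glasgow are all assumptions invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One node in the network; its 'local DBMS' is just a dict of fragments here."""
    name: str
    fragments: dict = field(default_factory=dict)  # fragment name -> list of rows

    def local_query(self, fragment, predicate):
        # A local application: touches only data stored at this site.
        return [row for row in self.fragments[fragment] if predicate(row)]


class GlobalCatalog:
    """Hypothetical catalog recording which sites hold a copy of each fragment."""

    def __init__(self):
        self.allocation = {}  # fragment name -> list of Site objects

    def allocate(self, fragment, rows, sites):
        # Store a copy of the fragment at every listed site (replication).
        for site in sites:
            site.fragments[fragment] = list(rows)
        self.allocation[fragment] = list(sites)

    def global_query(self, predicate):
        # A global application: the caller names no fragment and no site.
        result = []
        for fragment, sites in self.allocation.items():
            # Any replica will do; this sketch simply uses the first one.
            result.extend(sites[0].local_query(fragment, predicate))
        return result


london, glasgow = Site("London"), Site("Glasgow")
catalog = GlobalCatalog()
catalog.allocate("staff_london", [{"name": "Ann", "branch": "London"}], [london, glasgow])
catalog.allocate("staff_glasgow", [{"name": "Bob", "branch": "Glasgow"}], [glasgow])

all_staff = catalog.global_query(lambda row: True)                    # spans both sites
local_staff = glasgow.local_query("staff_glasgow", lambda row: True)  # stays at one site
```

The caller of `global_query` never mentions a site or a fragment, which is the sense in which the distribution is transparent. The catalog lookup is the extra machinery a centralized system would not need.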
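Advantages and Disadvantages of DDBMSs

Distributing data and applications has potential advantages over traditional centralized databases, but it also has drawbacks.

Advantages

Reflects organizational structure

Many organizations are naturally spread over several locations. For example, DreamHome has branch offices in many cities, so the database for such an application is naturally distributed over these locations. Each DreamHome branch keeps a database recording its staff, the properties for rent, and the property owners. Local staff can run local queries against the local database, while company headquarters can run global queries that access data at all or some of the branches.

Improved shareability and local autonomy

The distribution of the data can reflect the geographical distribution of an organization, and users at one site can access data held at other sites. Data is placed at the site close to the users who normally use it. Users therefore have local control over the data and can establish and enforce local policies on its use. A global database administrator (DBA) is responsible for the whole system. Part of this responsibility is usually delegated to the local level, so that a local DBA can manage the local DBMS.

Improved availability

In a centralized DBMS, a single computer failure halts all DBMS operations. In a DDBMS, however, the failure of one site or of a communication link only makes some sites inaccessible; it does not make the entire system inoperable. Distributed DBMSs are designed to keep working when such failures occur. If one site fails, the system may be able to redirect requests for the failed site to another site.

Improved reliability

Because data can be replicated and so exist at more than one site, the failure of a site or of a communication link does not necessarily prevent access to that data.

Improved performance

Because data is placed at the site closest to the "greatest demand", and because of the inherent parallelism of distributed DBMSs, access to a distributed database can be faster than access to a remote centralized database. In addition, since each site handles only part of the whole database, there may not be the same contention for CPU and I/O services found in a centralized DBMS.

Economics

In the 1960s, computing power was measured by the square of the equipment cost: three times the cost gave nine times the power. This is the well-known Grosch's Law. It is now generally accepted, however, that a system of smaller computers built at much lower cost can match the computing power of a single large computer. This makes it more economical for corporate divisions and departments to have their own computers. Adding a workstation to a network is also more economical than upgrading a mainframe system.

A second potential saving arises when databases are geographically remote and applications need access to distributed data. In that case, because transmitting data across the network costs more than local access, it can be cheaper to partition the application and run the processing at each site.

Modular growth

Expansion is easier in a distributed environment. New sites can be added to the network without affecting the operation of other sites, and this flexibility makes it relatively easy for an organization to expand. Growth in database size can be handled by adding processing and storage capacity to the network. Expanding a centralized DBMS may require changes to both hardware and software.

Disadvantages

Complexity

A distributed DBMS must hide its distributed nature from users and still give them satisfactory performance, reliability, and availability, which makes it inherently more complex than a centralized DBMS. The fact that data can be replicated adds further complexity. If the software cannot handle data replication properly, the availability, reliability, and performance of the distributed DBMS will be worse than those of a centralized DBMS, and the advantages listed above will turn into disadvantages.

Cost

Greater complexity means that acquiring and maintaining a DDBMS costs more than a centralized DBMS. A distributed DBMS also needs additional hardware to build the network connecting the sites, and there are ongoing costs for communication over that network. Managing and maintaining the local DBMSs and the underlying network requires additional labor as well.

Security

In a centralized system, access to data is easy to control. In a distributed DBMS, access to replicated data must be controlled at every site that holds it, and the network itself must also be secured. Networks used to be regarded as an insecure communication medium. Although this is still true to some extent, network security has improved considerably.

Integrity control more difficult

Database integrity refers to the validity and consistency of the stored data. Integrity is usually expressed as a set of constraints, which are consistency rules that the database must not violate. Enforcing integrity constraints requires access to a large amount of data that defines the constraints but is not actually involved in the update operation. In a distributed DBMS, the communication and processing costs of enforcing integrity constraints may be prohibitive.

Lack of standards

Distributed DBMSs depend on effective communication, yet standard communication and data access protocols are only now beginning to appear. The lack of such standards has seriously limited the potential of distributed DBMSs. There are also still no tools or methodologies for converting a centralized DBMS into a distributed DBMS.

Lack of experience

Although the protocols and problems of general-purpose distributed DBMSs are well understood, general-purpose distributed DBMSs have still not been widely accepted. Industry therefore has far less experience with distributed DBMSs than with centralized ones, and this is a significant obstacle for prospective users.

Database design more complex

Besides all the issues that arise in designing a centralized database, the design of a distributed database must also consider data fragmentation, the allocation of fragments to sites, and data replication.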
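The three extra design decisions named above (fragmentation, allocation, and replication) can be shown on a toy relation. The sketch below is an illustration only: the `properties` rows, the branch codes B1 and B2, the site names, and the helper functions `fragment_by` and `allocate` are hypothetical, and splitting rows by the value of one attribute is just one simple way to fragment a relation.

```python
def fragment_by(rows, key):
    """Horizontal fragmentation: group whole rows by the value of one attribute."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[key], []).append(row)
    return fragments


def allocate(fragments, placement):
    """Allocation with optional replication: placement maps fragment -> list of site names."""
    sites = {}
    for name, rows in fragments.items():
        for site in placement[name]:
            sites.setdefault(site, {})[name] = list(rows)
    return sites


properties = [
    {"id": 1, "branch": "B1", "rent": 400},
    {"id": 2, "branch": "B2", "rent": 650},
    {"id": 3, "branch": "B1", "rent": 500},
]

# Design decision 1: fragment the relation by branch.
fragments = fragment_by(properties, "branch")

# Design decisions 2 and 3: place each fragment, replicating B2 at two sites.
sites = allocate(fragments, {"B1": ["site_1"], "B2": ["site_2", "site_1"]})

# Sanity check a designer would want: the fragments together give back every row once.
rebuilt = sorted(
    (row for rows in fragments.values() for row in rows),
    key=lambda row: row["id"],
)
```

A centralized design needs none of these steps, which is why the choice of fragmentation attribute and placement map is an additional burden on the designer.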
