Data Recovery
Data recovery is the process of salvaging data from damaged, failed, corrupted, or inaccessible secondary storage media when it cannot be accessed normally. Often the data are salvaged from storage media such as internal or external hard disk drives, solid-state drives (SSDs), USB flash drives, storage tapes, CDs, DVDs, RAID arrays, and other electronics.
The most common data recovery scenario involves an operating system (OS) failure, in which case the goal is simply to copy all wanted files to another disk. This can be easily accomplished with a Live CD, most of which provide a means to mount the system drive and backup disks or removable media, and to move the files from the system disk to the backup media with a file manager or optical disc authoring software.
Another scenario involves a disk-level failure, such as a compromised file system or disk partition, or a hard disk failure. In any of these cases, the data cannot be easily read. Depending on the situation, solutions involve repairing the file system, partition table or master boot record, or hard disk recovery techniques ranging from software-based recovery of corrupted data to hardware replacement on a physically damaged disk. If hard disk recovery is necessary, the disk itself has typically failed permanently, and the focus is instead on a one-time recovery, salvaging whatever data can be read.
In a third scenario, files have been "deleted" from a storage medium. Typically, deleted files are not erased immediately; instead, references to them in the directory structure are removed, and the space they occupy is made available for later overwriting. In the meantime, the original file may be restored.
Although there is some confusion over the term, "data recovery" may also be used in the context of forensic applications or espionage.
Recovery techniques
Recovering data from physically damaged hardware can involve multiple techniques. Some damage can be repaired by replacing parts in the hard disk. This alone may make the disk usable, but there may still be logical damage. A specialized disk-imaging procedure is used to recover every readable bit from the surface. Once this image is acquired and saved on a reliable medium, the image can be safely analysed for logical damage and will possibly allow for much of the original file system to be reconstructed.
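The imaging step can be pictured with a short sketch. The Python fragment below is an illustration only, not a production tool: the device path, block size, and zero-fill policy are assumptions, and dedicated imagers such as GNU ddrescue do this far more carefully (retries, reverse passes, map files). It simply copies every block it can read and zero-fills the ones it cannot, so analysis can continue on the image rather than the failing drive.

```python
# Minimal read-skipping imaging sketch. Paths and block size are illustrative.
import os

BLOCK = 4096  # bytes per read attempt

def image_disk(source, image, block=BLOCK):
    """Copy every readable block of `source` into `image`, zero-filling bad blocks."""
    src = os.open(source, os.O_RDONLY)
    total = os.lseek(src, 0, os.SEEK_END)   # size of the source device/file
    bad = 0
    with open(image, "wb") as out:
        offset = 0
        while offset < total:
            want = min(block, total - offset)
            os.lseek(src, offset, os.SEEK_SET)
            try:
                data = os.read(src, want)
            except OSError:
                data = b"\x00" * want       # unreadable span: keep offsets aligned
                bad += 1
            if not data:                    # unexpected end of device
                break
            out.write(data)
            offset += len(data)
    os.close(src)
    return bad                              # number of blocks that could not be read

# Usage (requires sufficient privileges on a real device):
# unreadable = image_disk("/dev/sdb", "drive.img")
```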
Hardware repair
Examples of physical recovery procedures include: removing a damaged PCB (printed circuit board) and replacing it with a matching PCB from a healthy drive; performing a live PCB swap (used when the System Area of the HDD is damaged on the target drive: the System Area is instead read from the donor drive, and the PCB is then disconnected while still under power and transferred to the target drive); replacing the read/write head assembly with matching parts from a healthy drive; removing the hard disk platters from the original damaged drive and installing them into a healthy drive; and often a combination of these procedures. Some data recovery companies have procedures that are highly technical in nature and are not recommended for an untrained individual. Many of these procedures will void the manufacturer's warranty.
Remote data recovery
Remote data recovery requires a stable, high-bandwidth network connection, which many developing countries still lack. Also, it cannot be performed in case of physical damage to the media; in such cases, traditional in-lab recovery has to take place.
Remote backup service
A remote, online, or managed backup service provides users with a system for the backup and storage of computer files. Online backup systems are typically built around a client software program that runs on a schedule, typically once a day, and usually at night while computers are not in use. This program typically collects, compresses, encrypts, and transfers the data to the remote backup service provider's servers or off-site hardware.
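As a rough illustration of that collect-compress-encrypt-transfer cycle, the sketch below uses Python with the third-party cryptography package. The function names, the choice of the Fernet cipher, and the send callback are assumptions made for illustration; no particular vendor's client works exactly this way.

```python
# Illustrative backup-client cycle: read, compress, encrypt, hand off for upload.
import zlib
from pathlib import Path
from cryptography.fernet import Fernet

def back_up(paths, key, send):
    """Compress and encrypt each file, then pass the ciphertext to `send`."""
    cipher = Fernet(key)
    for path in paths:
        raw = Path(path).read_bytes()
        packed = zlib.compress(raw)        # lossless compression before transfer
        sealed = cipher.encrypt(packed)    # encrypted before it leaves the machine
        send(path, sealed)                 # e.g. an HTTPS upload to the provider

# Usage sketch: the key is generated once and kept by the user, not the provider.
# key = Fernet.generate_key()
# back_up(["/home/alice/notes.txt"], key, send=lambda name, blob: ...)
```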
Perhaps the most important aspect of backing up is that backups are stored in a different location from the original data. Traditional backup requires manually taking the backup media offsite; remote backup does not require user intervention. Advantages of remote backup include:
The user does not have to change tapes, label CDs or perform other manual steps.
Unlimited data retention.
Backups are automatic.
The correct files are backed up. Ordinary backup software is often installed with a list of files to be backed up. This set of files usually represents the state of the system when the software was installed, and often misses critical files, such as files that are added later.
Some remote backup services will work continuously, backing up files as they are changed.
Most remote backup services will maintain a list of versions of your files.
Most remote backup services will use 128- to 448-bit encryption to send data over unsecured links (i.e. the internet).
A few remote backup services can reduce backups by transmitting only changed binary data.
Depending on the available network bandwidth, the restoration of data can be slow. Because data is stored offsite, the data must be recovered either via the Internet or via a disk shipped from the online backup service provider.
Some backup service providers offer no guarantee that stored data will be kept private - for example, from employees. As such, most recommend that files be encrypted. It is possible that a remote backup service provider could go out of business or be purchased, which may affect the accessibility of one's data or the cost to continue using the service. If the encryption password is lost, no further data recovery will be possible; with managed services, however, this should not be a problem. Residential broadband services often have monthly limits that preclude large backups. They are also usually asymmetric; the user-to-network link regularly used to store backups is much slower than the network-to-user link used only when data is restored.
Typical features
Encryption - Data should be encrypted before it is sent across the internet, and it should be stored in its encrypted state. Encryption should be at least 256 bits, and the user should have the option of using his own encryption key, which should never be sent to the server.
Network backup - A backup service supporting network backup can back up multiple computers, servers or Network Attached Storage appliances on a local area network from a single computer or device.
Continuous backup (Continuous Data Protection) - Allows the service to back up continuously or on a predefined schedule. Both methods have advantages and disadvantages. Most backup services are schedule-based and perform backups at a predetermined time. Some services provide continuous data backups, which are used by large financial institutions and large online retailers; however, there is typically a trade-off in performance and system resources.
File-by-file restore - The ability for users to restore files themselves, without the assistance of a service provider, by allowing the user to select files by name and/or folder. Some services allow users to select files by searching filenames and folder names, by dates, by file type, by backup set, and by tags.
Online access to files - Some services allow you to access backed-up files via a normal web browser. Many services do not provide this type of functionality.
Data compression - Data will typically be compressed with a lossless compression algorithm to minimize the amount of bandwidth used.
Differential data compression - A way to further minimize network traffic is to transfer only the binary data that has changed from one day to the next, similar to the open-source file transfer service rsync. More advanced online backup services use this method rather than transferring entire files (a simplified sketch follows this list).
Bandwidth usage - User-selectable option to use more or less bandwidth; it may be possible to set this to change at various times of day.
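A simplified way to picture differential transfer is block-level change detection: hash fixed-size blocks and send only the blocks whose hashes changed since the previous run. The sketch below is an assumption-laden simplification (real tools such as rsync use rolling checksums and handle insertions, not just in-place changes), but it shows the basic idea in Python.

```python
# Hash fixed-size blocks and report only the blocks that changed since last backup.
import hashlib

BLOCK = 4096

def changed_blocks(data, previous_hashes):
    """Return (block_index, block_bytes) pairs that differ from the last backup.

    `previous_hashes` maps block index -> hex digest from the previous run and
    is updated in place so it can be reused next time.
    """
    changed = []
    for offset in range(0, len(data), BLOCK):
        index = offset // BLOCK
        block = data[offset:offset + BLOCK]
        digest = hashlib.sha256(block).hexdigest()
        if previous_hashes.get(index) != digest:
            changed.append((index, block))
        previous_hashes[index] = digest
    return changed
```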
Database administrator
A database administrator (DBA) is a person responsible for the design, implementation, maintenance and repair of an organization's database. DBAs are also known by the titles Database Coordinator or Database Programmer, and the role is closely related to those of Database Analyst, Database Modeller, Programmer Analyst, and Systems Manager. The role includes developing and designing database strategies, monitoring and improving database performance and capacity, and planning for future expansion requirements. DBAs may also plan, co-ordinate and implement security measures to safeguard the database.
Skills
Strong organizational skills
Strong logical and analytical thinking
Ability to concentrate and pay close attention to detail
Willingness to pursue education throughout one's career
Duties
A database administrator's activities include:
Transferring data
Replicating data
Maintaining the database and ensuring its availability to users
Controlling privileges and permissions of database users
Monitoring database performance
Database backup and recovery (a minimal sketch follows this list)
Database security
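To make one of these duties concrete, the sketch below shows a routine online backup using the sqlite3 module from the Python standard library. The file names are illustrative, and real DBA backup procedures depend entirely on the DBMS in use; this is just a minimal example of the backup-and-recovery duty.

```python
# Minimal online backup of a SQLite database using the standard-library backup API.
import sqlite3

def backup_database(live_path, backup_path):
    """Copy the live database into a backup file while it may still be in use."""
    src = sqlite3.connect(live_path)
    dest = sqlite3.connect(backup_path)
    with dest:
        src.backup(dest)   # consistent page-by-page copy of the live database
    src.close()
    dest.close()

# Usage sketch:
# backup_database("production.db", "nightly_backup.db")
```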
Network model
The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice.
Example of a network model.
The network model's original inventor was Charles Bachman, and it was developed into a standard specification published in 1969 by the CODASYL Consortium.
ADVANTAGES
The network model can handle one-to-many and many-to-many relationships. In network database terminology, a relationship is a set. Each set comprises two types of records: an owner record and a member record. In a network model, an application can access an owner record and all the member records within a set.
Data Integrity
In a network model, no member can exist without an owner. A user must therefore first define the owner record and then the member record. This ensures data integrity.
Data Independence
The network model draws a clear line of demarcation between programs and the complex physical storage details. The application programs work independently of the data, so changes made to the data characteristics do not affect the application programs.
DISADVANTAGES
System complexity
In a network model, data are accessed one record at a time. This makes it essential for database designers, administrators, and programmers to be familiar with the internal data structures to gain access to the data. Therefore, a user-friendly database management system cannot be created using the network model.
Making structural modifications to the database is very difficult in the network database model, as the data access method is navigational. Any changes made to the database structure require the application programs to be modified before they can access data. Though the network model achieves data independence, it still fails to achieve structural independence.
Object database
An object database (also object-oriented database management system) is a database management system in which information is represented in the form of objects, as used in object-oriented programming. Object databases are a niche field within the broader database management system (DBMS) market, which is dominated by relational database management systems. Object databases have been considered since the early 1980s, but they have made little impact on mainstream commercial data processing, though there is some usage in specialized areas.
Potential advantages:
Objects don't require assembly and disassembly, saving coding time and execution time (see the sketch after these lists)
Reduced paging
Easier navigation
Better concurrency control - a hierarchy of objects may be locked
Data model is based on the real world
Works well for distributed architectures
Less code required when applications are object oriented
Potential disadvantages:
Lower efficiency when data is simple and relationships are simple
Relational tables are simpler
Late binding may slow access speed
More user tools exist for RDBMS
Standards for RDBMS are more stable
Support for RDBMS is more certain and change is less likely to be required
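The "no assembly and disassembly" advantage can be illustrated with a very small sketch using Python's standard-library shelve module, which stores whole object graphs by key. shelve is only a simple persistent object store, not a full object DBMS, and the class and file names here are invented for the example; the point is that the nested object is saved and fetched as a unit rather than being decomposed into rows and reassembled with joins.

```python
# Store and retrieve an object graph as a unit with the standard-library shelve module.
import shelve

class Customer:
    def __init__(self, name, orders):
        self.name = name
        self.orders = orders          # nested data persists together with its owner

with shelve.open("customers.db") as db:
    db["c42"] = Customer("Jones", orders=[{"id": 1, "total": 87}])

with shelve.open("customers.db") as db:
    print(db["c42"].orders[0]["total"])   # prints 87, no joins or row mapping needed
```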
Data security
Data security is the means of ensuring that data is kept safe from corruption and that access to it is suitably controlled. Thus data security helps to ensure privacy. It also helps in protecting personal data.
Disk encryption
Disk encryption refers to encryption technology that encrypts data on a hard disk drive. Disk encryption typically takes form in either software (see disk encryption software) or hardware (see disk encryption hardware). Disk encryption is often referred to as on-the-fly encryption ("OTFE") or transparent encryption.
Software-based security solutions encrypt the data to prevent it from being stolen. However, a malicious program or a hacker may corrupt the data in order to make it unrecoverable or unusable. Similarly, encrypted operating systems can be corrupted by a malicious program or a hacker, making the system unusable. Hardware-based security solutions can prevent read and write access to data and hence offer very strong protection against tampering and unauthorized access.
Hardware-based or hardware-assisted computer security offers an alternative to software-only computer security. Security tokens such as those using PKCS#11 may be more secure due to the physical access required in order to be compromised. Access is enabled only when the token is connected and the correct PIN is entered (see two-factor authentication). However, dongles can be used by anyone who can gain physical access to them. Newer technologies in hardware-based security aim to solve this problem by offering foolproof security for data.
How hardware-based security works: a hardware device allows a user to log in, log out and set different privilege levels by performing manual actions. The device uses biometric technology to prevent malicious users from logging in, logging out, and changing privilege levels. The current state of a user of the device is read by controllers in peripheral devices such as hard disks. Illegal access by a malicious user or a malicious program is interrupted by the hard disk and DVD controllers based on the current state of the user, making illegal access to data impossible. Hardware-based access control is more secure than protection provided by operating systems, as operating systems are vulnerable to malicious attacks by viruses and hackers. The data on hard disks can be corrupted after malicious access is obtained. With hardware-based protection, software cannot manipulate the user privilege levels, so it is impossible for a hacker or a malicious program to gain access to secure data protected by hardware or to perform unauthorized privileged operations. The hardware protects the operating system image and file system privileges from being tampered with. Therefore, a completely secure system can be created using a combination of hardware-based security and secure system administration policies.
Backups
Backups are used to ensure that data which is lost can be recovered.
Data Masking
Data masking of structured data is the process of obscuring (masking) specific data within a database table or cell to ensure that data security is maintained and that sensitive information is not exposed to unauthorized personnel. This may include masking the data from users (for example, so banking customer representatives can only see the last four digits of a customer's national identity number), from developers (who need real production data to test new software releases but should not be able to see sensitive financial data), from outsourcing vendors, and so on.
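The "last four digits" rule mentioned above can be expressed as a tiny masking function. This is a minimal sketch; the function name, the mask character, and the sample identity number are all invented for illustration, and a real masking layer would typically be applied in the database or the presentation tier rather than ad hoc in application code.

```python
# Show only the last `visible` digits of an identity number; mask the rest.
def mask_identity_number(value, visible=4, mask_char="*"):
    digits = [c for c in value if c.isdigit()]
    keep = "".join(digits[-visible:])
    return mask_char * (len(digits) - len(keep)) + keep

print(mask_identity_number("470518-123456"))   # prints ********3456
```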
Data Erasure
Data erasure is a method of software-based overwriting that completely destroys all electronic data residing on a hard drive or other digital media to ensure that no sensitive data is leaked when an asset is retired or reused.
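A heavily simplified picture of software-based overwriting is given below for a single file. This is only an illustration: real data-erasure tools work at the whole-device level, verify every pass, and account for the fact that file systems and SSD wear levelling can leave copies that a file-level overwrite never touches.

```python
# Overwrite a file with random data several times, then delete it (file-level sketch only).
import os

def overwrite_file(path, passes=3):
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(os.urandom(size))   # overwrite the file contents with random bytes
            f.flush()
            os.fsync(f.fileno())        # push this pass out to the storage device
    os.remove(path)
```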
Database normalization
In the design of a relational database management system (RDBMS), the process of organizing data to minimize redundancy is called normalization. The goal of database normalization is to decompose relations with anomalies in order to produce smaller, well-structured relations. Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them. The objective is to isolate data so that additions, deletions, and modifications of a field can be made in just one table and then propagated through the rest of the database via the defined relationships.
Objectives of normalization
A basic objective of the first normal form defined by Codd in 1970 was to permit data to be queried and manipulated using a "universal data sub-language" grounded in first-order logic.[8] (SQL is an example of such a data sub-language, albeit one that Codd regarded as seriously flawed.)[9] The objectives of normalization beyond 1NF (First Normal Form) were stated as follows by Codd:
1. To free the collection of relations from undesirable insertion, update and deletion dependencies;
2. To reduce the need for restructuring the collection of relations as new types of data are introduced, and thus increase the life span of application programs;
3. To make the relational model more informative to users;
4. To make the collection of relations neutral to the query statistics, where these statistics are liable to change as time goes by.
E.F. Codd, "Further Normalization of the Data Base Relational Model"[10]
The sections below give details of each of these objectives.
An update anomaly. Employee 519 is shown as having different addresses on different records.
An insertion anomaly. Until the new faculty member, Dr. Newsome, is assigned to teach at least one course, his details cannot be recorded.
A deletion anomaly. All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.
When an attempt is made to modify (update, insert into, or delete from) a table, undesired side-effects may follow. Not all tables can suffer from these side-effects; rather, the side-effects can only arise in tables that have not been sufficiently normalized. An insufficiently normalized table might have one or more of the following characteristics:
The same information can be expressed on multiple rows; therefore updates to the table may result in logical inconsistencies. For example, each record in an "Employees' Skills" table might contain an Employee ID, Employee Address, and Skill; thus a change of address for a particular employee will potentially need to be applied to multiple records (one for each of his skills). If the update is not carried through successfully (that is, if the employee's address is updated on some records but not others), the table is left in an inconsistent state: it provides conflicting answers to the question of what this particular employee's address is. This phenomenon is known as an update anomaly.
There are circumstances in which certain facts cannot be recorded at all. For example, each record in a "Faculty and Their Courses" table might contain a Faculty ID, Faculty Name, Faculty Hire Date, and Course Code; thus we can record the details of any faculty member who teaches at least one course, but we cannot record the details of a newly hired faculty member who has not yet been assigned to teach any courses, except by setting the Course Code to null. This phenomenon is known as an insertion anomaly.
There are circumstances in which the deletion of data representing certain facts necessitates the deletion of data representing completely different facts. The "Faculty and Their Courses" table described in the previous example suffers from this type of anomaly: if a faculty member temporarily ceases to be assigned to any courses, we must delete the last of the records on which that faculty member appears, effectively also deleting the faculty member. This phenomenon is known as a deletion anomaly.
With this design, though, the database can answer only that one single query. It cannot by itself answer interesting but unanticipated queries: What is the most-wished-for book? Which customers are interested in WWII espionage? How does Lord Byron stack up against his contemporary poets? Answers to these questions must come from special adaptive tools completely separate from the database. One tool might be software written especially to handle such queries. This special adaptive software has just one single purpose: in effect to normalize the non-normalized field. Unforeseen queries can be answered trivially, and entirely within the database framework, with a normalized table.
Example
Querying and manipulating the data within an unnormalized data structure, such as the following non-1NF representation of customers' credit card transactions, involves more complexity than is really necessary:

Customer   Transactions (Tr. ID, Date, Amount)
Jones      12890  14-Oct-2003  87
           12904  15-Oct-2003  50
Wilkins    12898  14-Oct-2003  21
Stevens    12907  15-Oct-2003  18
           14920  20-Nov-2003  70
           15003  27-Nov-2003  60
To each customer there corresponds a repeating group of transactions. The automated evaluation of any query relating to customers' transactions therefore would broadly involve two stages:
1. Unpacking one or more customers' groups of transactions, allowing the individual transactions in a group to be examined, and
2. Deriving a query result based on the results of the first stage.
For example, in order to find out the monetary sum of all transactions that occurred in October 2003 for all customers, the system would have to know that it must first unpack the Transactions group of each customer, then sum the Amounts of all transactions thus obtained where the Date of the transaction falls in October 2003.
One of Codd's important insights was that this structural complexity could always be removed completely, leading to much greater power and flexibility in the way queries could be formulated (by users and applications) and evaluated (by the DBMS). The normalized equivalent of the structure above would look like this:

Customer   Tr. ID   Date         Amount
Jones      12890    14-Oct-2003  87
Jones      12904    15-Oct-2003  50
Wilkins    12898    14-Oct-2003  21
Stevens    12907    15-Oct-2003  18
Stevens    14920    20-Nov-2003  70
Stevens    15003    27-Nov-2003  60

Now each row represents an individual credit card transaction, and the DBMS can obtain the answer of interest simply by finding all rows with a Date falling in October and summing their Amounts. The data structure places all of the values on an equal footing, exposing each to the DBMS directly, so each can potentially participate directly in queries; whereas in the previous situation some values were embedded in lower-level structures that had to be handled specially. Accordingly, the normalized design lends itself to general-purpose query processing, whereas the unnormalized design does not.
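The normalized table can be exercised directly. The sketch below loads the rows above into an in-memory SQLite database (via the Python standard library) and answers the October 2003 question with a single query; the table and column names and the ISO date format are illustrative choices, not part of the original example.

```python
# Load the normalized transactions and sum the October 2003 amounts in one query.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE transactions (
    customer TEXT, tr_id INTEGER PRIMARY KEY, tr_date TEXT, amount INTEGER)""")
rows = [
    ("Jones",   12890, "2003-10-14", 87),
    ("Jones",   12904, "2003-10-15", 50),
    ("Wilkins", 12898, "2003-10-14", 21),
    ("Stevens", 12907, "2003-10-15", 18),
    ("Stevens", 14920, "2003-11-20", 70),
    ("Stevens", 15003, "2003-11-27", 60),
]
con.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

total, = con.execute(
    "SELECT SUM(amount) FROM transactions "
    "WHERE tr_date BETWEEN '2003-10-01' AND '2003-10-31'").fetchone()
print(total)   # 176 (87 + 50 + 21 + 18)
```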
Hierarchical model
Example of a hierarchical model.
In a hierarchical database, an entity type is the equivalent of a table; each individual record is represented as a row and an attribute as a column. Entity types are related to each other using 1:N mappings, also known as one-to-many relationships. This model is recognized as the first database model, created by IBM in the 1960s. The most recognized and used hierarchical databases are IMS, developed by IBM, and the Windows Registry by Microsoft.
In this model, the employee data table represents the "parent" part of the hierarchy, while the computer table represents the "child" part of the hierarchy. In contrast to tree structures usually found in computer software algorithms, in this model the children point to the parents. As shown, each employee may possess several pieces of computer equipment, but each individual piece of computer equipment may have only one employee owner. Consider the following structure:
EmpNo   Designation      ReportsTo
10      Director
20      Senior Manager   10
30      Typist           20
40      Programmer       20

In this structure, the "child" is of the same type as the "parent". The hierarchy stating that EmpNo 10 is the boss of 20, and that 30 and 40 each report to 20, is represented by the "ReportsTo" column. In relational database terms, the ReportsTo column is a foreign key referencing the EmpNo column. If the "child" data type were different, it would be in a different table, but there would still be a foreign key referencing the EmpNo column of the employees table. This simple model is commonly known as the adjacency list model, and was introduced by Dr. Edgar F. Codd after initial criticisms surfaced that the relational model could not model hierarchical data.
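The adjacency list structure above can be queried with a recursive common table expression. The sketch below uses SQLite via the Python standard library to list everyone who reports, directly or indirectly, to EmpNo 10; it is an illustration of the technique, and the table name and query shape are assumptions rather than part of the original example.

```python
# Traverse the ReportsTo adjacency list with a recursive CTE in SQLite.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (EmpNo INTEGER PRIMARY KEY, Designation TEXT, ReportsTo INTEGER)")
con.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    (10, "Director", None),
    (20, "Senior Manager", 10),
    (30, "Typist", 20),
    (40, "Programmer", 20),
])

query = """
WITH RECURSIVE subordinates(EmpNo, Designation) AS (
    SELECT EmpNo, Designation FROM employees WHERE ReportsTo = 10
    UNION ALL
    SELECT e.EmpNo, e.Designation
    FROM employees e JOIN subordinates s ON e.ReportsTo = s.EmpNo
)
SELECT EmpNo, Designation FROM subordinates
"""
for row in con.execute(query):
    print(row)   # (20, 'Senior Manager'), (30, 'Typist'), (40, 'Programmer')
```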
Database
A database is an organized collection of data for one or more purposes, usually in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels) in a way that supports processes requiring this information (for example, finding a hotel with vacancies). The term "database" refers both to the way its users view it and to the logical and physical materialization of its data and content in files, computer memory, and computer data storage.
Database design
Database design is the process of producing a detailed data model of a database. This logical data model contains all the logical and physical design choices and physical storage parameters needed to generate a design in a Data Definition Language, which can then be used to create a database. A fully attributed data model contains detailed attributes for each entity. The term database design can be used to describe many different parts of the design of an overall database system. Principally, and most correctly, it can be thought of as the logical design of the base data structures used to store the data. In the relational model these are the tables and views. In an object database the entities and relationships map directly to object classes and named relationships. However, the term database design could also be used to apply to the overall process of designing not just the base data structures, but also the forms and queries used as part of the overall database application within the database management system (DBMS).[1] The process of doing database design generally consists of a number of steps which will be carried out by the database designer. Usually, the designer must:
Determine the relationships between the different data elements.
Superimpose a logical structure upon the data on the basis of these relationships.
Database designs also include ER (entity-relationship model) diagrams. An ER diagram is a diagram that helps to design databases in an efficient way. Attributes in ER diagrams are usually modeled as an oval with the name of the attribute, linked to the entity or relationship that contains the attribute. Within the relational model the final step can generally be broken down into two further steps: determining the grouping of information within the system (generally determining what the basic objects are about which information is being stored), and then determining the relationships between these groups of information, or objects. This step is not necessary with an object database.[2]
8. Apply the normalization rules - Apply the data normalization rules to see if your tables are structured correctly. Make adjustments to the tables.
The physical design of the database specifies the physical configuration of the database on the storage media. This includes detailed specification of data elements, data types, indexing options and other parameters residing in the DBMS data dictionary. It is the detailed design of a system that includes modules and the database's hardware and software specifications.