Data Critique

Photo: the physical shipwreck records at the New Jersey Maritime Museum that were used to create the database
Our chosen data set contains information on shipwrecks off the coast of New Jersey from 1705 to 2013. It records details about each ship, such as its name, vessel type, material, weight, owner, and year built. Where a wreck was successfully investigated, the data set also records information about the accident itself: its time and location, the ship’s home and departure ports, and the cause of the wreck. Because some shipwrecks could not be traced, parts of this accident information are missing. In general, however, the more recent the shipwreck, the more complete its record.
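One way to check the pattern of more complete records for more recent wrecks is to measure how many fields are filled in for each record and average that by century. The following is a minimal sketch in Python, not the method used to build the database: the field names, the sample rows, and the helper functions are all hypothetical and stand in for whatever columns the real spreadsheet uses.

```python
from collections import defaultdict

# Hypothetical column names; the real spreadsheet may label these differently.
FIELDS = ["name", "vessel_type", "owner", "year_built", "cause"]

# Made-up placeholder rows for illustration only, not real museum records.
records = [
    {"name": "Example A", "vessel_type": None, "owner": None,
     "year_built": None, "cause": None, "year_lost": 1750},
    {"name": "Example B", "vessel_type": "Schooner", "owner": None,
     "year_built": None, "cause": "Stranded", "year_lost": 1850},
    {"name": "Example C", "vessel_type": "Schooner", "owner": "Example Owner",
     "year_built": 1840, "cause": None, "year_lost": 1880},
    {"name": "Example D", "vessel_type": "Steamer", "owner": "Example Owner",
     "year_built": 1920, "cause": "Collision", "year_lost": 1950},
]


def completeness(row, fields=FIELDS):
    """Return the fraction of the given fields that are filled in for one record."""
    filled = sum(1 for field in fields if row.get(field) not in (None, ""))
    return filled / len(fields)


def mean_completeness_by_century(rows):
    """Average record completeness, grouped by the century in which the ship was lost."""
    buckets = defaultdict(list)
    for row in rows:
        century = row["year_lost"] // 100 * 100
        buckets[century].append(completeness(row))
    return {century: sum(vals) / len(vals) for century, vals in sorted(buckets.items())}


for century, score in mean_completeness_by_century(records).items():
    print(f"{century}s: {score:.0%} of fields filled on average")
```

If the averages rise from one century to the next on the real data, that would support the pattern described above.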
The data was sourced from the New Jersey Maritime Museum. It was compiled from items in the museum’s scrapbook, which were collected from a variety of sources, such as newspaper archives, ships’ logs, diaries, USLSS annual reports, shipwreck books, periodicals, and internet sites. In the museum’s records, each wreck has its own folder, filed under the name of the ship. The data set covers 4,722 ships that sank in the Atlantic Ocean, primarily off the coast of New Jersey and in the Delaware River and Delaware Bay. The data sheet was created using an experimental Google tool called Fusion Tables. The information has been accumulated over the years, presumably with entries added whenever a shipwreck was reported. The spreadsheet organizes the information into searchable columns, allowing the reader to quickly scan the details of each incident and manipulate the data.
Given the information available in the dataset, we can gain more insight into the causes of shipwrecks by treating the events described as case studies or by finding correlations between different variables. For example, we may find relationships between the type of vessel and how often it appears among the wrecks, or find that some vessels are more prone than others to a particular kind of accident. Identifying the factors that contribute to a ship sinking may help us understand what to avoid and what to incorporate when building ships; thus, the information revealed through this dataset can be used for practical purposes. In addition to relationships between variables, we can also look for factors that are overwhelmingly common among the cases and conduct additional research to explain the trend. For example, if we find that a particular period had more shipwrecks than others, additional investigation might shed some light on the reasons why.
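The two kinds of analysis described above, relating vessel type to cause and comparing periods, can both be sketched as simple counts. The example below is a hedged illustration in Python using only the standard library; the field names (`vessel_type`, `cause`, `year_lost`) and the sample rows are assumptions made for the example, not values from the museum's data.

```python
from collections import Counter, defaultdict

# Made-up placeholder rows; field names and values are hypothetical.
records = [
    {"vessel_type": "Schooner", "cause": "Stranded", "year_lost": 1846},
    {"vessel_type": "Schooner", "cause": "Collision", "year_lost": 1852},
    {"vessel_type": "Steamer", "cause": "Fire", "year_lost": 1854},
    {"vessel_type": "Schooner", "cause": "Stranded", "year_lost": 1857},
    {"vessel_type": "Barge", "cause": None, "year_lost": 1921},
]


def cause_by_vessel_type(rows):
    """Cross-tabulate cause of wreck against vessel type, skipping incomplete rows."""
    table = defaultdict(Counter)
    for row in rows:
        if row["vessel_type"] and row["cause"]:
            table[row["vessel_type"]][row["cause"]] += 1
    return table


def wrecks_per_decade(rows):
    """Count wrecks by the decade in which the ship was lost."""
    counts = Counter()
    for row in rows:
        if row["year_lost"] is not None:
            counts[row["year_lost"] // 10 * 10] += 1
    return counts


for vessel, causes in cause_by_vessel_type(records).items():
    print(vessel, dict(causes))

for decade, count in sorted(wrecks_per_decade(records).items()):
    print(f"{decade}s: {count}")
```

Because the dataset lists only ships that were wrecked, counts like these show how often a vessel type or cause appears among recorded wrecks, not how likely a given type of ship was to sink in the first place.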
However, the dataset also leaves out some information that might be important or relevant for study. Ships that were never found, or that were never known to be lost, could not have been included. Ships outside the geographical and temporal boundaries of this dataset are also not included. Furthermore, the dataset indicates when each ship was built and when it was lost, but not how long it took to find the ship after the accident. Knowing how much time passed between the loss and the discovery could give insight into how complete or accurate some parts of the record are. Some contextual and qualitative information, such as which historical events may be connected or which seasonal factors may have been relevant to the shipwrecks, is also absent; other sources would need to be consulted for this. As such, there are many things the data cannot reveal on its own. For example, the dataset cannot answer questions such as what happened to the cargo the ships carried, or why so few lives were lost despite the ships sinking. The data set also does not specify the direction each ship was heading when it sank, nor how long the ship had been sailing prior to the accident. It includes the departure and destination ports of the ships, but this information alone does not allow researchers to form an accurate idea of the exact route taken by the ship.
Furthermore, as the data for the database was collected by a maritime museum, the spreadsheet assesses the shipwrecks using categories that would be familiar to those with a knowledge of nautical terminology. As a result, many of the categories and terms used can raise questions for members of the general public viewing the database. A narrative of the events is largely left out of the data and is only available at the museum in New Jersey itself, which raises further concerns about accessibility. In conclusion, while the dataset contains a wealth of information and many data points to explore, it has a number of limitations that may confine the scope of our project or necessitate additional work and research to address.