Azimuth Backup Project (Part 3)

22 January, 2017



Along with the bad news there is some good news:

• Over 380 people have pledged over $14,000 to the Azimuth Backup Project on Kickstarter, greatly surpassing our conservative initial goal of $5,000.

• Given our budget, we currently aim to back up 40 terabytes of data, and we are well on our way to this goal. You can see what we’ve done at Our Progress, and what we’re still doing at the Issue Tracker.

• I have gotten a commitment from Danna Gianforte, the head of Computing and Communications at U. C. Riverside, that eventually the university will maintain a copy of our data. (This commitment is based on my earlier estimate that we’d have 20 terabytes of data, so I need to see if 40 is okay.)

• I have gotten two offers from other people, saying they too can hold our data.

I’m hoping that the data at U. C. Riverside will be made publicly available through a server. The other offers may involve holding the data ‘secretly’ until such time as it becomes needed; that has its own complementary advantages.

However, the interesting problem that confronts us now is: how to spend our money?

You can see how we’re currently spending it on our Budget and Spending page. Basically, we’re paying a firm called Hetzner for servers and storage boxes.

We could simply continue to do this until our money runs out. I hope that long before then, U. C. Riverside will have taken over some responsibilities. If so, there would be a long period where our money would largely pay for a redundant backup. Redundancy is good, but perhaps there is something better.

Two members of our team, Sakari Maaranen and Greg Kochanski, have thoughts on this matter which I’d like to share. Sakari posted his thoughts on Google+, while Greg posted his in an email which he’s letting me share here.

Please read these and offer us your thoughts! Maybe you can help us decide on the best strategy!

Sakari Maaranen

For the record, here are my views on how we should use the budget that the Azimuth Climate Data Backup Project now has.

People have contributed this money specifically to this effort.

Some non-government entities have offered “free hosting”. Of course the project should take any and all free offers to host our data; those would not come out of our budget. But someone is still paying for the hosting, even if it is offered to us “for free”.

As far as spending goes, I think we should think in terms of 1) terabyte-months and 2) sufficient redundancy, and achieve these as cost-efficiently as possible. We should not just hand the money to any takers, but aim for the best bang for the buck. We owe that to the people who have contributed.

For example, if we burned through the cash quickly on expensive storage, I would consider that a failure. Instead, we must plan for the best use of the budget towards our mission.

What we have promised people is that we will back up and serve these data sets with the money they have given us. Let’s do exactly that.

We are currently serving the mission at approximately €0.006 per gigabyte-month, at least for as long as we have volunteers working for free. The cost would be somewhat higher if we paid for professional maintenance, which is a reasonable assumption if we plan for long-term service. Volunteer work cannot be guaranteed forever, even if it works for now.
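A quick back-of-the-envelope sketch of what that rate implies for the budget. The storage rate and the 40-terabyte target come from this post; the exchange rate is an assumption for illustration only:

```python
# Rough budget arithmetic using the figures quoted in the post.
COST_PER_GB_MONTH_EUR = 0.006   # current serving cost (from the post)
TARGET_TB = 40                  # backup target (from the post)
BUDGET_USD = 14_000             # Kickstarter pledges (from the post)
USD_TO_EUR = 0.93               # assumed exchange rate, for illustration

monthly_cost_eur = TARGET_TB * 1_000 * COST_PER_GB_MONTH_EUR
months_funded = BUDGET_USD * USD_TO_EUR / monthly_cost_eur

print(f"Monthly storage cost: EUR {monthly_cost_eur:.0f}")
print(f"Months the pledges would cover: {months_funded:.0f}")
```

At roughly €240 per month for 40 terabytes, the pledges alone would cover storage for about four and a half years, before accounting for any free hosting offers or a handover to U. C. Riverside.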

This is one view and the question is open to public discussion.

Greg Kochanski

Some misc thoughts.

1) As I see it, we have made some promise of serving the data (“create a better interface for getting it”) which can be an expensive thing.

UI coding isn’t all that easy, and takes some time.

Beyond that, we’ve promised to back up the data, and once you say “backup”, you’ve also made an implicit promise to make the data available.

2) I agree that if we have a backup, it is a logical extension to take continuous backups, but I wouldn’t say it’s necessary.

Perhaps the way to think about it is to ask the question, “what do our donors likely want”?

3) Clearly they want to preserve the data, in case it disappears from the Federal sites. So, that’s job 1. And, if it does disappear, we need to make it available.

3a) Making it available will require some serving CPU, disk, and network. We may need to worry about DDoS attacks, though perhaps we could get free coverage from Akamai or Google Project Shield.

3b) Making it available may imply paying some students to write Javascript and HTML to put up a front-end to allow people to access the data we are collecting.

Not all the data we’re collecting is in strictly servable form. Some of the databases, for example, aren’t usefully servable in the form we collect, and we know some links will be broken because of missing pages, or because of wget’s design flaw.*

[* Wget stores http://a/b/c as a file a/b/c, where a/b is a directory. But wget stores http://a/b as a file a/b.

Since the same path cannot be both a file and a directory, the two cannot exist simultaneously on disk. When both URLs occur, wget drops one.]
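A minimal sketch of the filesystem collision behind that footnote, using hypothetical paths: once `a/b` exists as a regular file (the mirror of http://a/b), it can no longer be created as a directory (needed to mirror http://a/b/c):

```python
# Demonstrate that one path cannot be both a file and a directory,
# which is why wget must drop one of http://a/b and http://a/b/c.
import os
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a"))

# wget saves http://a/b as the file "a/b" ...
with open(os.path.join(root, "a", "b"), "w") as f:
    f.write("contents of the page http://a/b\n")

# ... but saving http://a/b/c would require "a/b" to be a directory.
collided = False
try:
    os.makedirs(os.path.join(root, "a", "b"))
except FileExistsError:
    collided = True  # the filesystem refuses; one page must be dropped

print("collision:", collided)
```

The same conflict arises in the other order: if `a/b` is created as a directory first, opening it as a file for writing fails instead.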

Points 3 & 3a imply that we need to keep some money in the bank until either the websites are taken down, or we decide that the threat has abated. So, we need to figure out how much money to keep as a serving reserve. It doesn’t sound like UCR has committed to serve the data, though you could perhaps ask.

Beyond the serving reserve, I think we are free to do better backups (i.e. more than one data collection), and change detection.


Saving Climate Data (Part 4)

21 January, 2017

At noon today in Washington DC, while Trump was being inaugurated, all mentions of “climate change” and “global warming” were eliminated from the White House website.

Well, not all. The word “climate” still shows up here:

President Trump is committed to eliminating harmful and unnecessary policies such as the Climate Action Plan….

There are also reports that all mentions of climate change will be scrubbed from the website of the Environmental Protection Agency, or EPA.

From Motherboard

Let me quote from this article:

• Jason Koebler, All references to climate change have been deleted from the White House website, Motherboard, 20 January 2017.

Scientists and professors around the country had been rushing to download and rehost as much government science as was possible before the transition, based on a fear that Trump’s administration would neglect or outright delete government information, databases, and web applications about science. Last week, the Radio Motherboard podcast recorded an episode about these efforts, which you can listen to below, or anywhere you listen to podcasts.

The Internet Archive, too, has been keeping a close watch on the White House website; President Obama’s climate change page had been archived every single day in January.

So far, nothing on the Environmental Protection Agency’s website has changed under Trump, but a report earlier this week from Inside EPA, a newsletter and website that reports on the agency, suggested that pages about climate are destined to be cut within the first few weeks of his presidency.

Scientists I’ve spoken to who are archiving websites say they expect scientific data on the NASA, NOAA, Department of Energy, and EPA websites to be neglected or deleted eventually. They say they don’t expect agency sites to be updated immediately, but expect it to play out over the course of months. This sort of low-key data destruction might not be the type of censorship people typically think about, but scientists are treating it as such.

From Technology Review

Greg Egan pointed out another good article, on MIT’s magazine:

• James Temple, Climate data preservation efforts mount as Trump takes office, Technology Review, 20 January 2017.

Quoting from that:

Dozens of computer science students at the University of California, Los Angeles, will mark Inauguration Day by downloading federal climate databases they fear could vanish under the Trump Administration.

Friday’s hackathon follows a series of grassroots data preservation efforts in recent weeks, amid increasing concerns the new administration is filling agencies with climate deniers likely eager to cut off access to scientific data that undermine their policy views. Those worries only grew earlier this week, when Inside EPA reported that the Environmental Protection Agency transition team plans to scrub climate data from the agency’s website, citing a source familiar with the team.

Earlier federal data hackathons include the “Guerrilla Archiving” event at the University of Toronto last month, the Internet Archive’s Gov Data Hackathon in San Francisco at the beginning of January, and the DataRescue Philly event at the University of Pennsylvania last week.

Much of the collected data is being stored in the servers of the End of Term Web Archive, a collaborative effort to preserve government websites at the conclusion of presidential terms. The University of Pennsylvania’s Penn Program in Environmental Humanities launched the separate DataRefuge project, in part to back up environmental data sets that standard Web crawling tools can’t collect.

Many of the groups are working off a master list of crucial data sets from NASA, the National Oceanic and Atmospheric Administration, the U.S. Geological Survey, and other agencies. Meteorologist and climate journalist Eric Holthaus helped prompt the creation of that crowdsourced list with a tweet early last month.

Other key developments driving the archival initiatives included reports that the transition team had asked Energy Department officials for a list of staff who attended climate change meetings in recent years, and public statements from senior campaign policy advisors arguing that NASA should get out of the business of “politically correct environmental monitoring.”

“The transition team has given us no reason to believe that they will respect scientific data, particularly when it’s inconvenient,” says Gretchen Goldman, research director in the Center for Science and Democracy at the Union of Concerned Scientists. These historical databases are crucial to ongoing climate change research in the United States and abroad, she says.

To be clear, the Trump camp hasn’t publicly declared plans to erase or eliminate access to the databases. But there is certainly precedent for state and federal governments editing, removing, or downplaying scientific information that doesn’t conform to their political views.

Late last year, it emerged that text on Wisconsin’s Department of Natural Resources website was substantially rewritten to remove references to climate change. In addition, an extensive Congressional investigation concluded in a 2007 report that the Bush Administration “engaged in a systematic effort to manipulate climate change science and mislead policymakers and the public about the dangers of global warming.”

In fact these Bush Administration efforts were masterminded by Myron Ebell, who Trump chose to lead his EPA transition team!

Continuing:

In fact, there are wide-ranging changes to federal websites with every change in administration for a variety of reasons. The Internet Archive, which collaborated on the End of Term project in 2008 and 2012 as well, notes that more than 80 percent of PDFs on .gov sites disappeared during that four-year period.

The organization has seen a surge of interest in backing up sites and data this year across all government agencies, but particularly for climate information. In the end, they expect to collect well more than 100 terabytes of data, close to triple the amount in previous years, says Jefferson Bailey, director of Web archiving.

In fact the Azimuth Backup Project alone may gather about 40 terabytes!

From Inside EPA

And then there’s this view from inside the Environmental Protection Agency:

• Dawn Reeves, Trump transition preparing to scrub some climate data from EPA Website, Inside EPA, January 17, 2017

The incoming Trump administration’s EPA transition team intends to remove non-regulatory climate data from the agency’s website, including references to President Barack Obama’s June 2013 Climate Action Plan, the strategies for 2014 and 2015 to cut methane and other data, according to a source familiar with the transition team.

Additionally, Obama’s 2013 memo ordering EPA to establish its power sector carbon pollution standards “will not survive the first day,” the source says, a step that rule opponents say is integral to the incoming administration’s pledge to roll back the Clean Power Plan and new source power plant rules.

The Climate Action Plan has been the Obama administration’s government-wide blueprint for addressing climate change and includes information on cutting domestic greenhouse gas (GHG) emissions, including both regulatory and voluntary approaches; information on preparing for the impacts of climate change; and information on leading international efforts.

The removal of such information from EPA’s website — as well as likely removal of references to such programs that link to the White House and other agency websites — is being prepped now.

The transition team’s preparations fortify concerns from agency staff, environmentalists and many scientists that the Trump administration is going to destroy reams of EPA and other agencies’ climate data. Scientists have been preparing for this possibility for months, with many working to preserve key data on private websites.

Environmentalists are also stepping up their efforts to preserve the data. The Sierra Club Jan. 13 filed a Freedom of Information Act request seeking reams of climate-related data from EPA and the Department of Energy (DOE), including power plant GHG data. Even if the request is denied, the group said it should buy them some time.

“We’re interested in trying to download and preserve the information, but it’s going to take some time,” Andrea Issod, a senior attorney with the Sierra Club, told Bloomberg. “We hope our request will be a counterweight to the coming assault on this critical pollution and climate data.”

While Trump has pledged to take a host of steps to roll back Obama EPA climate and other high-profile actions on his first day in office, transition and other officials say the date may slip.

“In truth, it might not [happen] on the first day, it might be a week,” the source close to the transition says of the removal of climate information from EPA’s website. The source adds that in addition to EPA, the transition team is also looking at such information on the websites of DOE and the Interior Department.

Additionally, incoming Trump press secretary Sean Spicer told reporters Jan. 17 that not much may happen on Inauguration Day itself, but to expect major developments the following Monday, Jan. 23. “I think on [Jan. 23] you’re going to see a big flurry of activity” that is expected to include the disappearance of at least some EPA climate references.

Until Trump is inaugurated on Jan. 20, the transition team cannot tell agency staff what to do, and the source familiar with the transition team’s work is unaware of any communications requiring language removal or beta testing of websites happening now, though it appears that some of this work is occurring.

“We can only ask for information at this point until we are in charge. On [Jan. 20] at about 2 o’clock, then they can ask [staff] to” take actions, the source adds.

Scope & Breadth

The scope and breadth of the information to be removed is unclear. While it is likely to include executive actions on climate, it does not appear that the reams of climate science information, including models, tools and databases on the EPA Office of Research & Development’s (ORD) website will be impacted, at least not immediately.

ORD also has published climate, air and energy strategic research action plans, including one for 2016-2019 that includes research to assess impacts; prevent and reduce emissions; and prepare for and respond to changes in climate and air quality.

But other EPA information maintained on its websites including its climate change page and its “What is EPA doing about climate change” page that references the Climate Action Plan, the 2014 methane strategy and a 2015 oil and gas methane reduction strategy are expected targets.

Another possible target is new information EPA just compiled—and hosted a Jan. 17 webinar to discuss—on climate change impacts to vulnerable communities.

One former EPA official who has experience with transitions says it is unlikely that any top Obama EPA official is on board with this. “I would think they would be violently against this. . . I would think that the last thing [EPA Administrator] Gina McCarthy would want to do would to be complicit in Trump’s effort to purge the website” of climate-related work, and that if she knew she would “go ballistic.”

But the former official, the source close to the transition team and others note that EPA career staff is fearful and may be undertaking such prep work “as a defensive maneuver to avoid getting targeted,” the official says, adding that any directive would likely be coming from mid-level managers rather than political appointees or senior level officials.

But while the former official was surprised that such work might be happening now, the fact that it is only said to be targeting voluntary efforts “has a certain ring of truth to it. Someone who is knowledgeable would draw that distinction.”

Additionally, one science advocate says, “The people who are running the EPA transition have a long history of sowing misunderstanding about climate change and they tend to believe in a vast conspiracy in the scientific community to lie to the public. If they think the information is truly fraudulent, it would make sense they would try to scrub it. . . . But the role of the agency is to inform the public . . . [and not to satisfy] the musings of a band of conspiracy theorists.”

The source was referring to EPA transition team leader Myron Ebell, a long-time climate skeptic at the Competitive Enterprise Institute, along with David Schnare, another opponent of climate action, who is at the Energy & Environment Legal Institute.

And while “a new administration has the right to change information about policy, what they don’t have the right to do is change the scientific information about policies they wish to put forward and that includes removing resources on science that serve the public.”

The advocate adds that many state and local governments rely on EPA climate information.

EPA Concern

But there has been plenty of concern that such a move would take place, especially after transition team officials last month sought the names of DOE employees who worked on climate change, raising alarms and cries of a “political witch hunt” along with a Dec. 13 letter from Sen. Maria Cantwell (D-WA) that prompted the transition team to disavow the memo.

Since then, scientists have been scrambling to preserve government data.

On Jan. 10, High Country News reported that on a Saturday last month, 150 technology specialists, hackers, scholars and activists assembled in Toronto for the “Guerrilla Archiving Event: Saving Environmental Data from Trump” where the group combed the internet for key climate and environmental data from EPA’s website.

“A giant computer program would then copy the information onto an independent server, where it will remain publicly accessible—and safe from potential government interference.”

The organizer of the event, Henry Warwick, said, “Say Trump firewalls the EPA,” pulling reams of information from public access. “No one will have access to the data in these papers” unless the archiving took place.

Additionally, the Union of Concerned Scientists released a Jan. 17 report, “Preserving Scientific Integrity in Federal Policy Making,” urging the Trump administration to retain scientific integrity. It wrote in a related blog post, “So how will government science fare under Trump? Scientists are not just going to wait and see. More than 5,500 scientists have now signed onto a letter asking the president-elect to uphold scientific integrity in his administration. . . . We know what’s at stake. We’ve come too far with scientific integrity to see it unraveled by an anti-science president. It’s worth fighting for.”


The Irreversible Momentum of Clean Energy

17 January, 2017

The president of the US recently came out with an article in Science. It’s about climate change and clean energy:

• Barack Obama, The irreversible momentum of clean energy, Science, 13 January 2017.

Since it’s open-access, I’m going to take the liberty of quoting the whole thing, minus the references, which provide support for a lot of his facts and figures.

The irreversible momentum of clean energy

The release of carbon dioxide (CO2) and other greenhouse gases (GHGs) due to human activity is increasing global average surface air temperatures, disrupting weather patterns, and acidifying the ocean. Left unchecked, the continued growth of GHG emissions could cause global average temperatures to increase by another 4°C or more by 2100 and by 1.5 to 2 times as much in many midcontinent and far northern locations. Although our understanding of the impacts of climate change is increasingly and disturbingly clear, there is still debate about the proper course for U.S. policy — a debate that is very much on display during the current presidential transition. But putting near-term politics aside, the mounting economic and scientific evidence leave me confident that trends toward a clean-energy economy that have emerged during my presidency will continue and that the economic opportunity for our country to harness that trend will only grow. This Policy Forum will focus on the four reasons I believe the trend toward clean energy is irreversible.

ECONOMIES GROW, EMISSIONS FALL

The United States is showing that GHG mitigation need not conflict with economic growth. Rather, it can boost efficiency, productivity, and innovation. Since 2008, the United States has experienced the first sustained period of rapid GHG emissions reductions and simultaneous economic growth on record. Specifically, CO2 emissions from the energy sector fell by 9.5% from 2008 to 2015, while the economy grew by more than 10%. In this same period, the amount of energy consumed per dollar of real gross domestic product (GDP) fell by almost 11%, the amount of CO2 emitted per unit of energy consumed declined by 8%, and CO2 emitted per dollar of GDP declined by 18%.
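A quick sanity check on those figures: the declines in energy per dollar of GDP and in CO2 per unit of energy compose multiplicatively, and together they imply the quoted decline in CO2 per dollar of GDP:

```python
# Sanity check on the intensity figures quoted above (2008-2015):
# energy per dollar of GDP fell ~11%, CO2 per unit of energy fell ~8%.
energy_per_gdp = 1 - 0.11   # relative level in 2015 vs. 2008
co2_per_energy = 1 - 0.08
co2_per_gdp = energy_per_gdp * co2_per_energy

decline = 1 - co2_per_gdp
print(f"implied decline in CO2 per dollar of GDP: {decline:.1%}")
```

This comes out to about 18%, matching the article’s third figure.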

The importance of this trend cannot be overstated. This “decoupling” of energy sector emissions and economic growth should put to rest the argument that combatting climate change requires accepting lower growth or a lower standard of living. In fact, although this decoupling is most pronounced in the United States, evidence that economies can grow while emissions do not is emerging around the world. The International Energy Agency’s (IEA’s) preliminary estimate of energy related CO2 emissions in 2015 reveals that emissions stayed flat compared with the year before, whereas the global economy grew. The IEA noted that “There have been only four periods in the past 40 years in which CO2 emission levels were flat or fell compared with the previous year, with three of those — the early 1980s, 1992, and 2009 — being associated with global economic weakness. By contrast, the recent halt in emissions growth comes in a period of economic growth.”

At the same time, evidence is mounting that any economic strategy that ignores carbon pollution will impose tremendous costs to the global economy and will result in fewer jobs and less economic growth over the long term. Estimates of the economic damages from warming of 4°C over preindustrial levels range from 1% to 5% of global GDP each year by 2100. One of the most frequently cited economic models pins the estimate of annual damages from warming of 4°C at ~4% of global GDP, which could lead to lost U.S. federal revenue of roughly $340 billion to $690 billion annually.

Moreover, these estimates do not include the possibility of GHG increases triggering catastrophic events, such as the accelerated shrinkage of the Greenland and Antarctic ice sheets, drastic changes in ocean currents, or sizable releases of GHGs from previously frozen soils and sediments that rapidly accelerate warming. In addition, these estimates factor in economic damages but do not address the critical question of whether the underlying rate of economic growth (rather than just the level of GDP) is affected by climate change, so these studies could substantially understate the potential damage of climate change on the global macroeconomy.

As a result, it is becoming increasingly clear that, regardless of the inherent uncertainties in predicting future climate and weather patterns, the investments needed to reduce emissions — and to increase resilience and preparedness for the changes in climate that can no longer be avoided — will be modest in comparison with the benefits from avoided climate-change damages. This means, in the coming years, states, localities, and businesses will need to continue making these critical investments, in addition to taking common-sense steps to disclose climate risk to taxpayers, homeowners, shareholders, and customers. Global insurance and reinsurance businesses are already taking such steps as their analytical models reveal growing climate risk.

PRIVATE-SECTOR EMISSIONS REDUCTIONS

Beyond the macroeconomic case, businesses are coming to the conclusion that reducing emissions is not just good for the environment — it can also boost bottom lines, cut costs for consumers, and deliver returns for shareholders.

Perhaps the most compelling example is energy efficiency. Government has played a role in encouraging this kind of investment and innovation. My Administration has put in place (i) fuel economy standards that are net beneficial and are projected to cut more than 8 billion tons of carbon pollution over the lifetime of new vehicles sold between 2012 and 2029 and (ii) 44 appliance standards and new building codes that are projected to cut 2.4 billion tons of carbon pollution and save $550 billion for consumers by 2030.

But ultimately, these investments are being made by firms that decide to cut their energy waste in order to save money and invest in other areas of their businesses. For example, Alcoa has set a goal of reducing its GHG intensity 30% by 2020 from its 2005 baseline, and General Motors is working to reduce its energy intensity from facilities by 20% from its 2011 baseline over the same timeframe. Investments like these are contributing to what we are seeing take place across the economy: Total energy consumption in 2015 was 2.5% lower than it was in 2008, whereas the economy was 10% larger.

This kind of corporate decision-making can save money, but it also has the potential to create jobs that pay well. A U.S. Department of Energy report released this week found that ~2.2 million Americans are currently employed in the design, installation, and manufacture of energy-efficiency products and services. This compares with the roughly 1.1 million Americans who are employed in the production of fossil fuels and their use for electric power generation. Policies that continue to encourage businesses to save money by cutting energy waste could pay a major employment dividend and are based on stronger economic logic than continuing the nearly $5 billion per year in federal fossil-fuel subsidies, a market distortion that should be corrected on its own or in the context of corporate tax reform.

MARKET FORCES IN THE POWER SECTOR

The American electric-power sector — the largest source of GHG emissions in our economy — is being transformed, in large part, because of market dynamics. In 2008, natural gas made up ~21% of U.S. electricity generation. Today, it makes up ~33%, an increase due almost entirely to the shift from higher-emitting coal to lower-emitting natural gas, brought about primarily by the increased availability of low-cost gas due to new production techniques. Because the cost of new electricity generation using natural gas is projected to remain low relative to coal, it is unlikely that utilities will change course and choose to build coal-fired power plants, which would be more expensive than natural gas plants, regardless of any near-term changes in federal policy. Although methane emissions from natural gas production are a serious concern, firms have an economic incentive over the long term to put in place waste-reducing measures consistent with standards my Administration has put in place, and states will continue making important progress toward addressing this issue, irrespective of near-term federal policy.

Renewable electricity costs also fell dramatically between 2008 and 2015: the cost of electricity fell 41% for wind, 54% for rooftop solar photovoltaic (PV) installations, and 64% for utility-scale PV. According to Bloomberg New Energy Finance, 2015 was a record year for clean energy investment, with those energy sources attracting twice as much global capital as fossil fuels.

Public policy — ranging from Recovery Act investments to recent tax credit extensions — has played a crucial role, but technology advances and market forces will continue to drive renewable deployment. The levelized cost of electricity from new renewables like wind and solar in some parts of the United States is already lower than that for new coal generation, without counting subsidies for renewables.

That is why American businesses are making the move toward renewable energy sources. Google, for example, announced last month that, in 2017, it plans to power 100% of its operations using renewable energy — in large part through large-scale, long-term contracts to buy renewable energy directly. Walmart, the nation’s largest retailer, has set a goal of getting 100% of its energy from renewables in the coming years. And economy-wide, solar and wind firms now employ more than 360,000 Americans, compared with around 160,000 Americans who work in coal electric generation and support.

Beyond market forces, state-level policy will continue to drive clean-energy momentum. States representing 40% of the U.S. population are continuing to move ahead with clean-energy plans, and even outside of those states, clean energy is expanding. For example, wind power alone made up 12% of Texas’s electricity production in 2015 and, at certain points in 2015, that number was >40%, and wind provided 32% of Iowa’s total electricity generation in 2015, up from 8% in 2008 (a higher fraction than in any other state).

GLOBAL MOMENTUM

Outside the United States, countries and their businesses are moving forward, seeking to reap benefits for their countries by being at the front of the clean-energy race. This has not always been the case. A short time ago, many believed that only a small number of advanced economies should be responsible for reducing GHG emissions and contributing to the fight against climate change. But nations agreed in Paris that all countries should put forward increasingly ambitious climate policies and be subject to consistent transparency and accountability requirements. This was a fundamental shift in the diplomatic landscape, which has already yielded substantial dividends. The Paris Agreement entered into force in less than a year, and, at the follow-up meeting this fall in Marrakesh, countries agreed that, with more than 110 countries representing more than 75% of global emissions having already joined the Paris Agreement, climate action “momentum is irreversible”. Although substantive action over decades will be required to realize the vision of Paris, analysis of countries’ individual contributions suggests that meeting medium-term respective targets and increasing their ambition in the years ahead — coupled with scaled-up investment in clean-energy technologies — could increase the international community’s probability of limiting warming to 2°C by as much as 50%.

Were the United States to step away from Paris, it would lose its seat at the table to hold other countries to their commitments, demand transparency, and encourage ambition. This does not mean the next Administration needs to follow identical domestic policies to my Administration’s. There are multiple paths and mechanisms by which this country can achieve — efficiently and economically — the targets we embraced in the Paris Agreement. The Paris Agreement itself is based on a nationally determined structure whereby each country sets and updates its own commitments. Regardless of U.S. domestic policies, it would undermine our economic interests to walk away from the opportunity to hold countries representing two-thirds of global emissions — including China, India, Mexico, European Union members, and others — accountable. This should not be a partisan issue. It is good business and good economics to lead a technological revolution and define market trends. And it is smart planning to set long-term emission-reduction targets and give American companies, entrepreneurs, and investors certainty so they can invest and manufacture the emission-reducing technologies that we can use domestically and export to the rest of the world. That is why hundreds of major companies — including energy-related companies from ExxonMobil and Shell, to DuPont and Rio Tinto, to Berkshire Hathaway Energy, Calpine, and Pacific Gas and Electric Company — have supported the Paris process, and leading investors have committed $1 billion in patient, private capital to support clean-energy breakthroughs that could make even greater climate ambition possible.

CONCLUSION

We have long known, on the basis of a massive scientific record, that the urgency of acting to mitigate climate change is real and cannot be ignored. In recent years, we have also seen that the economic case for action — and against inaction — is just as clear, the business case for clean energy is growing, and the trend toward a cleaner power sector can be sustained regardless of near-term federal policies.

Despite the policy uncertainty that we face, I remain convinced that no country is better suited to confront the climate challenge and reap the economic benefits of a low-carbon future than the United States and that continued participation in the Paris process will yield great benefit for the American people, as well as the international community. Prudent U.S. policy over the next several decades would prioritize, among other actions, decarbonizing the U.S. energy system, storing carbon and reducing emissions within U.S. lands, and reducing non-CO2 emissions.

Of course, one of the great advantages of our system of government is that each president is able to chart his or her own policy course. And President-elect Donald Trump will have the opportunity to do so. The latest science and economics provide a helpful guide for what the future may bring, in many cases independent of near-term policy choices, when it comes to combatting climate change and transitioning to a clean energy economy.


Saving Climate Data (Part 3)

23 December, 2016

You can back up climate data, but how can anyone be sure your backups are accurate? Let’s suppose the databases you’ve backed up have been deleted, so that there’s no way to directly compare your backup with the original. And to make things really tough, let’s suppose that faked databases are being promoted as competitors with the real ones! What can you do?

One idea is ‘safety in numbers’. If a bunch of backups all match, and they were made independently, it’s less likely that they all suffer from the same errors.

Another is ‘safety in reputation’. If one bunch of backups of climate data is held by academic institutes of climate science, and another bunch by climate change denying organizations (conveniently listed here), you probably know which one you trust more. (And this is true even if you’re a climate change denier, though your answer may differ from mine.)

But a third idea is to use a cryptographic hash function. In very simplified terms, this is a method of taking a database and computing a fairly short string from it, called a ‘digest’.

[Figure: schematic of a cryptographic hash function mapping input data to a fixed-length digest]

A good hash function makes it hard to change the database and get a new one with the same digest. So, if the owner of a database computes and publishes the digest, anyone can check that a backup is correct by computing the backup’s digest and comparing it to the published one.

It’s not foolproof, but it works well enough to be helpful.

Of course, it only works if we have some trustworthy record of the original digest. But the digest is much smaller than the original database: for example, in the popular method called SHA-256, the digest is 256 bits long. So it’s much easier to make copies of the digest than to back up the original database. These copies should be stored in trustworthy ways—for example, the Internet Archive.
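This procedure can be sketched with the standard `sha256sum` tool mentioned below; the file names here are made up, with a toy file standing in for a real dataset:

```shell
# Toy stand-ins for a dataset and its backup (all file names are illustrative):
printf 'temperature,1979,12.3\n' > original.dat
cp original.dat backup.dat

# The data's owner computes the digest and publishes it...
sha256sum original.dat | awk '{print $1}' > published.sha256

# ...and anyone holding a backup can verify their copy against the published digest:
test "$(sha256sum backup.dat | awk '{print $1}')" = "$(cat published.sha256)" \
  && echo "backup matches" || echo "backup DOES NOT match"
```

A single flipped bit in `backup.dat` would produce a completely different digest, so the comparison fails loudly rather than subtly.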

When Sakari Maaranen made a backup of the University of Idaho Gridded Surface Meteorological Data, he asked the custodians of that data to publish a digest, or ‘hash file’. One of them responded:

Sakari and others,

I have made the checksums for the UofI METDATA/gridMET files (1979-2015) as both md5sums and sha256sums.

You can find these hash files here:

https://www.northwestknowledge.net/metdata/data/hash.md5

https://www.northwestknowledge.net/metdata/data/hash.sha256

After you download the files, you can check the sums with:

md5sum -c hash.md5

sha256sum -c hash.sha256

Please let me know if something is not ideal and we’ll fix it!

Thanks for suggesting we do this!

Sakari replied:

Thank you so much! This means everything to public mirroring efforts. If you’d like to help promote this Best Practice further, consider getting it recognized as a standard when you do online publishing of key public information.

1. Publishing those hashes is already a major improvement on its own.

2. Publishing them on a secure website offers people further guarantees that there has not been any man-in-the-middle.

3. Digitally signing the checksum files offers the best easily achievable guarantees of data integrity by the person(s) who sign the checksum files.

Please consider having these three steps included in your science organisation’s online publishing training and standard Best Practices.

Feel free to forward this message to whom it may concern. Feel free to rephrase as necessary.

As a separate item, public mirroring instructions for how to best download your data and/or public websites would further guarantee permanence of all your uniquely valuable science data and public contributions.

Right now we should get this message viral through the government funded science publishing people. Please approach the key people directly – avoiding the delay of using official channels. We need to have all the uniquely valuable public data mirrored before possible changes in funding.

Again, thank you for your quick response!
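The digital signing that Sakari recommends in step 3 can be sketched with OpenSSL, standing in for whatever signing tool (such as GnuPG) a publisher actually prefers; the checksum file and key names here are illustrative:

```shell
# A stand-in checksum file, as would be produced by `sha256sum data.nc > hash.sha256`:
printf 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855  data.nc\n' > hash.sha256

# Generate a signing key pair (a real publisher would use an existing, well-known key):
openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out signing_key.pem
openssl pkey -in signing_key.pem -pubout -out signing_key.pub

# Create a detached signature over the checksum file, then verify it:
openssl dgst -sha256 -sign signing_key.pem -out hash.sha256.sig hash.sha256
openssl dgst -sha256 -verify signing_key.pub -signature hash.sha256.sig hash.sha256
```

Anyone holding the public key can then confirm both that the checksums are unmodified and who vouched for them.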

There are probably lots of things to be careful about. Here’s one. Maybe you can think of more, and ways to deal with them.

What if the data keeps changing with time? This is especially true of climate records, where new temperatures and so on are added to a database every day, or month, or year. Then I think we need to ‘time-stamp’ everything. The owners of the original database need to keep a list of digests, with the time each one was made. And when you make a copy, you need to record the time it was made.
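One minimal way to keep such a time-stamped list is an append-only log pairing each digest with the UTC time it was computed; the file names here are made up:

```shell
# A toy stand-in for a dataset that grows over time:
printf 'temperature,1979,12.3\n' > dataset.dat

# Record a timestamped digest of the current version:
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ)  $(sha256sum dataset.dat)" >> digest_log.txt

# When new records are appended, log the new digest alongside its time:
printf 'temperature,1980,12.5\n' >> dataset.dat
echo "$(date -u +%Y-%m-%dT%H:%M:%SZ)  $(sha256sum dataset.dat)" >> digest_log.txt

cat digest_log.txt
```

Each line of the log identifies one version of the database, so a backup made at a known time can be checked against the digest that was current then.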


Azimuth Backup Project (Part 2)

20 December, 2016


azimuth_logo

I want to list some databases that are particularly worth backing up. But to do this, we need to know what’s already been backed up. That’s what this post is about.

Azimuth backups

Here is information as of now (21:45 GMT 20 December 2016). I won’t update this information. For up-to-date information see

Azimuth Backup Project: Issue Tracker.

For up-to-date information on the progress of each of the individual databases listed below, click on my summary of what’s happening now.

Here are the databases that we’ve backed up:

• NASA GISTEMP website at http://data.giss.nasa.gov/gistemp/ — downloaded by Jan and uploaded to Sakari’s datarefuge server.

• NOAA Carbon Dioxide Information Analysis Center (CDIAC) data at ftp.ncdc.noaa.gov/pub/data/paleo/cdiac.ornl.gov-pub — downloaded by Jan and uploaded to Sakari’s datarefuge server.

• NOAA Carbon Tracker website at http://www.esrl.noaa.gov/psd/data/gridded/data.carbontracker.html — downloaded by Jan, uploaded to Sakari’s datarefuge server.

These are still in progress, but I think we have our hands on the data:

• NOAA Precipitation Frequency Data at http://hdsc.nws.noaa.gov/hdsc/pfds/ and ftp://hdsc.nws.noaa.gov/pub — downloaded by Borislav, not yet uploaded to Sakari’s datarefuge server.

• NOAA Carbon Dioxide Information Analysis Center (CDIAC) website at http://cdiac.ornl.gov — downloaded by Jan, uploaded to Sakari’s datarefuge server, but there’s evidence that the process was incomplete.

• NOAA website at https://www.ncdc.noaa.gov — downloaded by Jan, who is now attempting to upload it to Sakari’s datarefuge server.

• NOAA National Centers for Environmental Information (NCEI) website at https://www.ncdc.noaa.gov — downloaded by Jan, who is now attempting to upload it to Sakari’s datarefuge server, but there are problems.

• Ocean and Atmospheric Research data at ftp.oar.noaa.gov — downloaded by Jan, now attempting to upload it to Sakari’s datarefuge server.

• NOAA NCEP/NCAR Reanalysis ftp site at ftp.cdc.noaa.gov/Datasets/ncep.reanalysis/ — downloaded by Jan, now attempting to upload it to Sakari’s datarefuge server.

I think we’re getting these now, more or less:

• NOAA National Centers for Environmental Information (NCEI) ftp site at ftp://eclipse.ncdc.noaa.gov/pub/ — in the process of being downloaded by Jan, “Very large. May be challenging to manage with my facilities”.

• NASA Planetary Data System (PDS) data at https://pds.nasa.gov — in the process of being downloaded by Sakari.

• NOAA tides and currents products website at https://tidesandcurrents.noaa.gov/products.html, which includes the sea level trends data at https://tidesandcurrents.noaa.gov/sltrends/sltrends.html — Jan is downloading this.

• NOAA National Centers for Environmental Information (NCEI) satellite datasets website at https://www.ncdc.noaa.gov/data-access/satellite-data/satellite-data-access-datasets — Jan is downloading this.

• NASA JASON3 sea level data at http://sealevel.jpl.nasa.gov/missions/jason3/ — Jan is downloading this.

• U.S. Forest Service Climate Change Atlas website at http://www.fs.fed.us/nrs/atlas/ — Jan is downloading this.

• NOAA Global Monitoring Division website at http://www.esrl.noaa.gov/gmd/dv/ftpdata.html — Jan is downloading this.

• NOAA Global Monitoring Division ftp data at aftp.cmdl.noaa.gov/ — Jan is downloading this.

• NOAA National Data Buoy Center website at http://www.ndbc.noaa.gov/ — Jan is downloading this.

• NASA-ESDIS Oak Ridge National Laboratory Distributed Active Archive (DAAC) on Biogeochemical Dynamics at https://daac.ornl.gov/get_data.shtml — Jan is downloading this.

• NASA-ESDIS Oak Ridge National Laboratory Distributed Active Archive (DAAC) on Biogeochemical Dynamics website at https://daac.ornl.gov/ — Jan is downloading this.

Other backups

Other backups are listed at

The Climate Mirror Project, https://climate.daknob.net/.

This nicely provides the sizes of various backups, and other useful information. Some are ‘signed and verified’ with cryptographic keys, but I’m not sure exactly what that means, and the details matter.

About 90 databases are listed here, along with some size information and notes on whether people have already backed them up or are in the process of doing so:

Gov. Climate Datasets (Archive). (Click on the tiny word “Datasets” at the bottom of the page!)




Azimuth Backup Project (Part 1)

16 December, 2016


azimuth_logo

This blog page is to help organize the Azimuth Environmental Data Backup Project, or Azimuth Backup Project for short. This is part of the larger but decentralized, frantic and somewhat disorganized project discussed elsewhere:

Saving Climate Data (Part 2), Azimuth, 15 December 2016.

Here I’ll just say what we’re doing at Azimuth.

Jan Galkowski is a statistician and engineer at Akamai Technologies, a company in Cambridge, Massachusetts, whose content delivery network is one of the world’s largest distributed computing platforms, responsible for serving at least 15% of all web traffic. He has begun copying some of the publicly accessible US government climate databases. On 11 December he wrote:

John, so I have just started trying to mirror all of CDIAC [the Carbon Dioxide Information Analysis Center]. We’ll see. I’ll put it in a tarball, and then throw it up on Google. It should keep everything intact. Using WinHTTrack. I have coordinated with Eric Holthaus via Twitter, creating, per your suggestion, a new personal account which I am using exclusively to follow the principals.

Once CDIAC is done, and checked over, I’ll move on to other sites.

There are things beyond our control, such as paper records, or records which are online but are not within visibility of the public.

Oh, and I’ve formally requested time off from work for latter half of December so I can work this on vacation. (I have a number of other projects I want to work in parallel, anyway.)

By 14 December he was wanting some more storage space. He asked David Tanzer and me:

Do either of you have a large Google account, or the “unlimited storage” option at Amazon?

I’m using WebDrive, a commercial product. What I’m (now) doing is defining an FTP map at a .gov server, and then a map to my Amazon Cloud Drive. I’m using Windows 7, so these appear as standard drives (or mounts, in *nix terms). I navigate to an appropriate place on the Amazon Drive, and then I proceed to copy from .gov to Amazon.

There is no compression, and, in order to be sure I don’t abuse the .gov site, I’m deliberately passing this over a wireless network in my home, which limits the transfer rate. If necessary, and if the .gov site permits, I could hardwire the workstation to our FIOS router and get appreciably faster transfer. (I often do that for large work files.)

The nice thing is I get to work from home 3 days a week, so I can keep an eye on this. And I’m taking days off just to do this.

I’m thinking about how I might get a second workstation in the act.

The Web sites themselves I’m downloading, as mentioned, using HTTrack. I intended to tarball-up the site structure and then upload to Amazon. I’m still working on CDIAC at ORNL. For future sites, I’m going to try to get HTTrack to mirror directly to Amazon using one of the mounts.

I asked around for more storage space, and my request was kindly answered by Scott Maxwell. Scott lives in Pasadena, California, and he used to work for NASA: he even had a job driving a Mars rover! He is now a site reliability engineer at Google, and he works on Google Drive. Scott is setting up a 10-terabyte account on Google Drive, which Jan and others will be able to use.

Meanwhile, Jan noticed some interesting technical problems: for some reason WebDrive is barely using the capacity of his network connection, so things are moving much more slowly than they could in theory.

Most recently, Sakari Maaranen offered his assistance. Sakari is a systems architect at Ubisecure, a firm in Finland that specializes in identity management, advanced user authentication, authorization, single sign-on, and federation. He wrote:

I have several terabytes worth in Helsinki (can get more) and a gigabit connection. I registered my offer but they [the DataRefuge people] didn’t reply though. I’m glad if that means you have help already and don’t need a copy in Helsinki.

I replied saying that the absence of a reply probably means that they’re overwhelmed by offers of help and are struggling to figure out exactly what to do. Scott said:

Hey, Sakari! Thank you for the generous offer!

I’m setting these guys up with Google Drive storage, as at least a short-term solution.

IMHO, our first order of business is just to get a copy of the data into a location we control—one that can’t easily be taken away from us. That’s the rationale for Google Drive: it fits into Jan’s existing workflow, so it’s the lowest-friction path to getting a copy of the data that’s under our control.

How about if I propose this: we let Jan go ahead with the plan of backing up the data in Drive. Then I’ll evaluate moving it from there to whatever other location we come up with. (Or copying instead of moving! More copies is better. :-) How does that sound to you?

I admit I haven’t gotten as far as thinking about Web serving at all—and it’s not my area of expertise anyway. Maybe you’d be kind enough to elaborate on your thoughts there.

Sakari responded with some information about servers. In late January, U. C. Riverside may help me with this—until then they are busy trying to get more storage space, for wholly unrelated reasons. But right now it seems the main job is to identify important data and get it under our control.

There are a lot of ways you could help.

Computer skills. Personally I’m not much use with anything technical about computers, but the rest of the Azimuth Data Backup gang probably has technical questions that some of you out there could answer… so, I encourage discussion of those questions. (Clearly some discussions are best done privately, and at some point we may encounter unfriendly forces, but this is a good place for roaming experts to answer questions.)

Security. Having a backup of climate data is not very useful if there are also fake databases floating around and you can’t prove yours is authentic. How can we create a kind of digital certificate that our database matches what was on a specific website at a specific time? We should do this if someone here has the expertise.

Money. If we wind up wanting to set up a permanent database with a nice front end, accessible from the web, we may want money. We could do a Kickstarter campaign. People may be more interested in giving money now than later, unless the political situation immediately gets worse after January 20th.

Strategy. We should talk a bit about what to do next, though too much talk tends to prevent action. Eventually, if all goes well, our homegrown effort will be overshadowed by others, at least in sheer quantity. About 3 hours ago Eric Holthaus tweeted “we just got a donation of several petabytes”. If it becomes clear that others are putting together huge, secure databases with nice front ends, we can either quit or—better—cooperate with them, and specialize on something we’re good at and enjoy.


Saving Climate Data (Part 2)

16 December, 2016

I want to get you involved in the Azimuth Environmental Data Backup Project, so click on that for more. But first some background.

A few days ago, many scientists, librarians, archivists, computer geeks and environmental activists started to make backups of US government environmental databases. We’re trying to beat the January 20th deadline just in case.

Backing up data is always a good thing, so there’s no point in arguing about politics or the likelihood that these backups are needed. The present situation is just a nice reason to hurry up and do some things we should have been doing anyway.

As of 2 days ago the story looked like this:

Saving climate data (Part 1), Azimuth, 13 December 2016.

A lot has happened since then, but much more needs to be done. Right now you can see a list of 90 databases to be backed up here:

Gov. Climate Datasets (Archive). (Click on the tiny word “Datasets” at the bottom of the page!)

Despite the word ‘climate’, the scope includes other environmental databases, and rightly so. Here is a list of databases that have been backed up:

The Climate Mirror Project.

By going here and clicking “Start Here to Help”:

Climate Mirror.

you can nominate a dataset for rescue, claim a dataset to rescue, let everyone know about a data rescue event, or help in some other way (which you must specify). There’s also other useful information on this page, which was set up by Nick Santos.

The overall effort is being organized by the Penn Program in the Environmental Humanities, or ‘PPEHLab’ for short, headed by Bethany Wiggin. If you want to know what’s going on, it helps to look at their blog:

DataRescuePenn.

However, the people organizing the project are currently overwhelmed with offers of help! People worldwide are proceeding to take action in a decentralized way! So, everything is a bit chaotic, and nobody has an overall view of what’s going on.

I can’t overstate this: if you think that ‘they’ have a plan and ‘they’ know what’s going on, you’re wrong. ‘They’ is us. Act accordingly.

Here’s a list of news articles, a list of ‘data rescue events’ where people get together with lots of computers and do stuff, and a bit about archives and archivists.

News

Here are some things to read:

• Jason Koebler, Researchers are preparing for Trump to delete government science from the web, Vice, 13 December 2016.

• Brady Dennis, Scientists frantically copying U.S. climate data, fearing it might vanish under Trump, Washington Post, 13 December, 2016. (Also at the Chicago Tribune.)

• Eric Holthaus, Why I’m trying to preserve federal climate data before Trump takes office, Washington Post, 13 December 2016.

• Nicole Mortillaro, U of T heads ‘guerrilla archiving event’ to preserve climate data ahead of Trump presidency, CBC News, 14 December 2016.

• Audie Kornish and Eric Holthaus, Scientists race to preserve climate change data before Trump takes office, All Things Considered, National Public Radio, 14 December 2016.

Data rescue events

There’s one in Toronto:

Guerrilla archiving event, 10 am – 4 pm EST, Saturday 17 December 2016. Location: Bissell Building, 4th Floor, 140 St. George St. University of Toronto.

There will be one in Philadelphia:

DataRescuePenn Data Harvesting, Friday–Saturday 13–14 January 2017. Location: not determined yet, probably somewhere at the University of Pennsylvania, Philadelphia.

I hear there will also be events in New York City and Los Angeles, but I don’t know details. If you do, please tell me!

Archives and archivists

Today I helped catalyze a phone conversation between Bethany Wiggin, who heads the PPEHLab, and Nancy Beaumont, head of the Society of American Archivists. Digital archivists have a lot of expertise in saving information, so their skills are crucial here. Big wads of disorganized data are not very useful.

In this conversation I learned that some people are already in contact with the Internet Archive. This archive always tries to save US government websites and databases at the end of each presidential term. Their efforts are not limited to environmental data, and they save not only webpages but entire databases, e.g. data in ftp sites. You can nominate sites to be saved here:

• Internet Archive, End of Presidential Term Harvest 2016.

For more details read this:

• Internet Archive blog, Preserving U.S. Government Websites and Data as the Obama Term Ends, 15 December 2016.