Magical Conversion of Text to Dates – What The Function
Biomass Rules Blog students who were working ahead on last week’s crude oil and corn price comparison likely hit an impasse when downloading the monthly corn price data from USDA. The USDA, National Ag Statistics Service (NASS), Quick Stats monthly data is available in 3-letter months (JAN, FEB, MAR, etc.). The USDA data used in the monthly corn price in the crude oil/corn price comparison was converted to monthly dates using this technique. Data is not created equally. Similar data can arrive on one’s device in multiple forms.
The Department of Energy, EIA crude oil price data downloads quickly, ready to use, in about 5 mouse clicks. The USDA Quick Stats data – one of my favorite datasets available – nearly requires a PhD to operate. Downloading monthly data is not even in chronological order. And to get the months in a meaningful order, it requires a numeric date. We are all working on it. This post shares some PhD data magic. Anyone can do it!
MS Excel comes with a handy function, =DATEVALUE(source), that converts text that looks like a numeric date into a functionally numeric date. Numeric dates allow the date used in calculation and graphs that are time sensitive. Text dates cannot be added or subtracted. The actual conversion is the easy part with the DATEVALUE() function. The more difficult aspect of this process is piecing together an observation-specific date in letters that looks like a numeric date for DATEVALUE() to convert.
The most difficult term in this post is CONCATENATE. What the Function???
This technique allows the contents of multiple cells to be combined in the same cell. It also allows the addition of new information. We are going to use the existing 3-letter month (JAN), and the existing numeric year (2022), and add additional information for the non-existent day of the month data. It is monthly data, so there is no day provided. But Excel expects a month/day/year in some form for its numeric dates, so it works better to add a day. In this case, we will add a “1” for the first of the month.
It works best using the traditional concatenation coding to think of the code string as an equation. As such the operands and cell addresses are easily read by Excel without any qualifying code. The new text added must be separated from the other code by double quotation marks “.
In the screenshot of today’s image, the three components to combine are:
- The 3-letter month, JAN, in Cell B2.
- The additional first day of the month, but it requires double quotes and a ‘space’ character on both sides of the ‘1’ – plus a comma after the 1. Or, ” 1, ” [Remember there is space character sandwiching the ‘1,’].
- The numeric year, 2022, in Cell A2.
The actual coding equation is, =B2&” 1, “&A2.
Easy peasy, right?
The missing piece of information is the concatenation operand (character) is the Shift-7, or “&” character. The ‘=’ sign opens the code string. B2 is the first object. Then the first &. The second component, ” 1, “. Closing off the text string, day, with another ‘&’. Finally, the last object, the year in Cell A2.
The way this spreadsheet is setup Column C contains the concatenation formula. The code is inserted in the row below the Jan 2020 record, but the result of the formula can be seen in Row 2, Column C. This crazy concatenation formula delivers the text, JAN 1, 2020. Now we are ready for the DATEVALUE() step.
The concatenated date in in Cell C2, so in Cell D2, the following formula is entered, =DATEVALUE(C2).
This will return a date with the numeric equivalent of 01/01/2020. The reason the new, numeric date looks like it does, Jan-20, is that a date format was selected in Excel that only displays the month-year. Even though the D2 value begins with three letters, they are only a label for the numeric month for January.
Excel formats and formatting require a separate blog post. The formatting behaves like an output/result of the coding that is actually inside of each cell. When a new format is chosen, it does not change the underlying information. Only what is displayed and visible.
The flip-side of this is that the DATEVALUE(C2) formula depends on the information in Columns A, B, and C. If, once the new numeric date is created, there is a desire to clean up the formative columns, the DATEVALUE() information must be copied into a new column. When the DATEVALUE() date is copied to a new standalone column, it must be pasted as text instead of as a formula. Once the DATEVALUE() dates have been copied – as text – to a new column, Columns A-D can be deleted. The copied date (as text) is still a numeric date. It is simply no longer a coding formula. If this is new information, it is a lot.
Excel does have a concatenating function, CONCAT(). Using it the contents of the parathesis for the above formula would be, =CONCAT(B2,” 1,”,A2). The similarities should be clear. I grew up before Excel had CONTCAT(), so my personal preference is coding the old fashioned way.
Concatenation is a useful tool. I use it for more than text-to-numeric dates. It is especially powerful when using the INDEX(MATCH()) functions together. But that is for a later story.
Comments
Magical Conversion of Text to Dates – What The Function — No Comments
HTML tags allowed in your comment: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>