Help, too many rows with 0

0

Hi,

My web site back end creates a transaction export csv file but it has a lot of columns.  Actually, 240 columns.   I only need information from 10 or so columns but it is a pain to constantly navigate the columns and the back end does not allow column selection. 

In an effort to make this an easier process.

I created a new Excel file with three sheets. Summary, Filtered and Data.    The CSV file will be updated and downloaded often with new data. 

So I used the Excel Import function to create a connection and imported the CSV file into the Data Sheet.    From the Data Sheet I copied the columns I wanted to view and pasted the link into the Filtered Sheet. 

  Then in the Summary Sheet I created an =SUBTOTAL (9, Filtered! J2:J300) to total the filtered columns.

My new spreadsheet worked great except, the file size was over 115 MB.  This file size caused a very slow start.  I couldn't figure out the issue until I unfiltered the Filtered Sheet and had 1,048,576 rows and my CSV file only has 200 rows.  The additional rows had 0's in each row.  So a lot of unnecessary rows.  I deleted the rows and the file size was reduced to 2.6 MB.  But, as soon as the CSV file was refreshed all the 0 rows returned. 

I formatted the cells with a 0;-0;;@ to customized format and it removed all the unnecessary 0's but also removed all the other formatting.    

I've tried many other steps but still stuck.  Any help would be greatly appreciated.

Thank you

Answer
Discuss

Answers

0
Selected Answer

Formatting doesn't remove data. It only displays them differently. The problem either hails from when the CSV file is created or when it is converted to Excel. Open the CSV file in a text editor. Look at the end, if there are blank lines, probably represented by commas or zeroes or even blanks. For good measure also look at the end of the lines to see if blank columns are added. Each column is represented by a comma.

If the CSV file contains only data, as it should, the fault must be with the process of importing it into Excel.. Somehow Excel would get the notion that it must fill up all of its cells. Note that this might be true horizontally as well as vertically. Closely look at the dialog box that opens when you import the CSV file, in particular the separators that are specified. A CSV file is comma-separated. The effect you observe may be caused by specifying additional separators. I lack the experience to do more than guess but it appears plausible to me that there should be a setting which suppresses the creation of blank cells such as you experience, and that setting must be in the import dialog.

If all of this fails here is code to remove the blanks after they were created. You will see that the circumstances for which this code was created match the first scenario in that the blanks are physically present in the CSV file. Frankly, if that were so in your case I would suggest to look at what can be done to the TXT file rather than dealing with the problem after import. You can edit it using NotePad or VBA. However, as a stop-gap the code might be useful for you.

Please try this code instead of the system you have been using. Install it in a standard module of your workbook and run it after you have updated the Data sheet.

Sub GetColumn()
    ' 14 Apr 2018
    
    Const InputTabName As String = "Data"
    Const OutputTabName As String = "Filtered"
    Const ColumnList As String = "C, F:G, L, V:Z"
    
    Dim CopyRange As Range
    Dim Rng As Range
    Dim Sp() As String
    Dim Clm As Variant
    Dim Rf As Long, Rl As Long                  ' first row, last row
    Dim C As Long
    
    Sp = Split(ColumnList, ",")
    With Worksheets(InputTabName)
        With .UsedRange
            Rf = .Row
            Rl = Rf + .Rows.Count - 1
        End With
        For C = 0 To UBound(Sp)
            Clm = Split(Sp(C), ":")
            If UBound(Clm) = 0 Then
                ReDim Preserve Clm(1)
                Clm(1) = Clm(0)
            End If
            Clm(0) = Columns(Trim(Clm(0))).Column
            Clm(1) = Columns(Trim(Clm(1))).Column
            Set Rng = Range(.Cells(Rf, CLng(Clm(0))), .Cells(Rl, CLng(Clm(1))))
            
            If C Then
                Set CopyRange = Application.Union(CopyRange, Rng)
            Else
                Set CopyRange = Rng
            End If
        Next C
    End With
    
    CopyRange.Copy Destination:=Worksheets(OutputTabName).Cells(1, 1)
End Sub
It's a no-frills macro. Before use, please make sure that the Filtered sheet is blank. The names of the Input and Output tabs must be spelled correctly. Use the constant ColumnList to specify the columns you wish to lift. Follow the format (comma separated) of the example. To run the code place the cursor anywhere in the code and press F5 (or Run > Run Macro).

To install the code, press Alt+F11 to open the VB Editor. Right-click on the workbook's name which you see in the Project Explorer window on the top left. Select Insert and Module. Paste the code in the blank pane on the right of your screen.

Discuss

Discussion

Hi, Thank you very much for your help.  I checked the CSV file and it appears to be ok.    I also saved the CSV file to an XLSX file and changed the connection in my Summary, Filtered & Data file to that file and still same issue.  The Data Sheet does not display any unnecessary rows.   So, I believe the issue is the formula/ link I used to extract the required columns from the Data Sheet into the Filtered Sheet.  The additional rows only show in the Filtered Sheet not in the Data Sheet.  Im unsure what to do so maybe I need to start over.  What would be the best approach for extracting information from a CSV file that is downloaded often from my web site?    Due to the large number of columns and data I need to filter certain columns by year, order status and date plus create a Summary of total sales, sales by state, sales tax etc.     My goal was to create a simple, fast way to extract the required information from the downloaded CSV file, filter by adjustable criteria and summarize the totals.    So any help would be appreciated. Thank you,
Why Me (rep: 2) Apr 12, '18 at 9:55 am
You wrote, "From the Data Sheet I copied the columns I wanted to view and pasted the link into the Filtered Sheet." This now looks like the source of the problem. What do you mean by "paste the link"? I imagine that you just manually copy/paste the columns. However, if you select the entire column instead of only the used rows, that's where you might paste a million zeroes accidentally depending upon how you do it. Hence my suspicion of the word "link". Could you create a de-personalised copy of your workbook and post it here?
Variatus (rep: 1595) Apr 12, '18 at 8:51 pm
BTW, I wonder if the number of rows in the CSV / Data sheet are always the same.
Variatus (rep: 1595) Apr 12, '18 at 10:58 pm
Hi, Answers to both posts. 1.     The CSV file that I download from my web site will increase in number of rows.    This is the main reason I made the connection to the CSV file.  I wanted a way to link / connect to the continually changing CSV file.  This probably isnt the best way but the only way I thought was best at the time.  2.     I did copy the column(s) completely which I agree may have created the problem.  I pasted the column into the Filtered sheet by right clicking and selecting Paste Link.  However, at the time, I did not realize or believe it would cause such a file size issue and create so many rows with 0.  3.     Yes I can create a file but if the Paste Link function is the cause of the issue then it will be a quick fix.  4.     If the paste link is not the correct method what is the best way to add links to select columns from one sheet to another.    Plus connect to a CSV file that constantly adds more rows. Let me know if you still want the file. Thank you,    
Why Me (rep: 2) Apr 13, '18 at 11:23 am
Setting a link means that you leave the data where they are and work with images of them in your Filtered sheet. Just press OK and the data will be copied. That should end your problem. However please take a look at the suggestion I have added to my answer.
Variatus (rep: 1595) Apr 14, '18 at 1:27 am
Hi I understand your Macro and will give it a try...  But, I didnt understand what you meant I understand the paste / image information you mentioned.  But, where you say Just press OK and the data will be copied. That should end your problem.  I didnt follow what you meant.  Are you referring to the macro or something else? Also, since I must use this file monthly I needed to make sure the Connection would bring in the new data from a new downloaded CSV file.  After refreshing the connection the new data did not appear in the Data Sheet.  I checked the file name / location and it is the same as the original connection fie.  Since that didnt work, I changed the imported file from a CSV to an XLSX file and reestablished the Connection to this new XLSX file with the new downloaded data.    After refreshing the connection the new data did not appear in the Data Sheet.  It appears my attempt to use the Connection part of Excel is not going to work.  So, my paste link function didnt work and now the Connecting doesnt work.  So back to the drawing board.  Once I get rid of the million rows of 0s I will create a link to separate workbooks not use the import function in Excel.  That is unless you have a better suggestion. Thank you for all your help.
Why Me (rep: 2) Apr 16, '18 at 1:54 pm
You are right to question my advice. It wasn't well considered. I thought of it after I posted. What I meant is that a normal Copy/Paste should copy the data for only those rows that matter to you. Of course, your intention is to establish a link, and that was the part I overlooked. Please disregard.
To my current understanding, you download a CSV file and somehow you get the content of this file into your workbook's Data sheet. From there the macro I provided should do the rest of the job. Please try it. The part described above as "somehow" could probably be automated but that should be another day's question. Same goes for any additions to the macro to create totals or otherwise manipulate the extracted data.
Variatus (rep: 1595) Apr 16, '18 at 10:00 pm
Due to the limited number of characters allowed to post I had to break this up into two posts. Thank you for the response and the clarification.  I understand the macro will remove the 0s but I ran into another problem.  I will briefly explain but this may be a new question to post.  .. My intent was to create a spreadsheet to automate this process but it failed.  1) Manually, download a CSV file from my web site.   The constantly changing file has all sales data and I need to extract certain information contained in columns on a monthly basis.     2)  I created a Summary Sheet with Totals of certain criteria in the Filtered Sheet.  The Filtered Sheet pulled certain columns from the Data Sheet but as we discussed my Link from the Data Sheet to the Filtered Sheet added 0s to unused cells thus increasing the file size.  Originally, I thought the file size caused by the 0s was the only issue.  But, not so.  My understanding, the Macro presented would automatically remove the 0s.  Now, other issues are present which I didnt anticipate. 
Why Me (rep: 2) Apr 18, '18 at 10:04 am
3)  The downloaded CSV file was imported into Excel and a Connection was created.  I thought this would allow me to update the original CSV file with new information.  Then, open my multiple sheet file that contained my Summary Sheet, Filter Sheet & Data Sheet.    Then, reestablished the Connection to the original CSV file by refreshing the connection.   But, this does not work.    New data is not present in the multiple sheet. 4)  I tried changing the CSV file to an XLSX file in hopes the CSV file was the issue but the connection does refresh but does not show any new data.    The file shows in the Connection list but I cannot get it to update with new data.  I tried copy / paste information for my CSV file into the data sheet but the column Links I created in the Filtered Sheet breaks and I get an =Ref error.  So, my brief explanation is a little long but if I cant get the Connection to refresh with new data then my whole multiple sheet theory is DOA.  Should I post a new question...? How to get data from one CSV file or XLSX file into a new multiple sheet file that extracts certain criteria so the information can be totaled or filtered.  And, the CSV file or an XLSX is constantly updated.  Either way, thank you so much for your help.
Why Me (rep: 2) Apr 18, '18 at 10:05 am
The new question, if any, is how to update the 'Data' sheet in your multi-sheet workbook. In the "old" way, the data in this sheet are created from a downloaded CSV file. Consider them "arrived".
The current question (as far as I see) deals with how to extract certain columns to a smaller sheet. For this purpose you have two solutions. One is your own which results in extra rows which might best be avoided if you create a link not to the entire column but to a smaller section of it. I also suggested code to download which (untried by me) is supposed to remove the zeroes. The other is the code I wrote (apparently untried by you) whcih doesn't link but copies specified columns from the "Data" sheet with a single click. With this system neither the CVS data in thre 'Data' sheet nor the extract thereof in the 'Filtered' sheet update automatically but you might create a button to do the latter and ask another question to accomplish the former.
Variatus (rep: 1595) Apr 18, '18 at 8:59 pm
 I have 3 choices. I want an automatic solution so the first is out.  The remove zero macro may try that next.  I used the code you wrote and created the macro.  It worked without adding all the zeros.  So a huge move forward.    I ran into two issues. 1.     I wanted the Filtered sheet columns placed in a certain order but the macro code placed them in alphabetical order.   Im assuming there is an easy code fix, if not, I can live with it.  2.     In the Filtered sheet I filter several of the columns.   My original sheet the filter options remained intact every time I updated the Data Sheet but of course this caused the zero issue.  So I tested the new way with the macro. I added in additional rows into the Data Sheet and ran the macro.  The new data appeared in the Filtered Sheet.  However, the filter options did not pick up the new data.  I tried several ways but the only way the filters would pick up the new rows was to delete all data on the filtered sheet and rerun the macro.  Then, reset several column filters.  I moved ahead but still not 100 percent.  This leads to two questions 1.     Can I add a Filter selection option on my Summary Sheet?  If so, how? 2.     Is there another way to bring in the information from the Data Sheet into the Filtered sheet but leave the Filter options intact?
Why Me (rep: 2) Jun 8, '18 at 3:28 pm
Yes, yes and yes. The sequence of columns to be imported could be changed. The filters could either be preserved or code could be added to create new filters. Your having decided on one of the three options is a game changer. That's why I recommended moving to a new thread. In the new thread the base is the existing setup with the code in it (the chosen solution not as end but as the beginning). You would need to create a sanitized workbook and post it. Then amendments can be made to meet more of your requirements.
Variatus (rep: 1595) Jun 9, '18 at 9:05 pm
Hi Various I have to say when I first posted to this site I merely wanted a quick and easy solution to my problem.  Even though I love to learn new stuff I was a little apprehensive about using macros.  I dont know why but I was in a hurry to get the problem behind me not thinking I could use this knowledge to develop new spreadsheets with increased ease of use and function.  So thank you for your interest in my problem and expanding my knowledge of Excel.   I will create a sanitized file and will post it soon. Thanks again
Why Me (rep: 2) Jun 10, '18 at 10:35 am
Add to Discussion

Answer the Question

You must create an account to use the forum. Create an Account or Login