==========================
Outputting CSV with Django
==========================

This document explains how to output CSV (Comma Separated Values) dynamically
using Django views. To do this, you can either use the Python CSV library or the
Django template system.

Using the Python CSV library
============================

Python comes with a CSV library, :mod:`csv`. The key to using it with Django is
that the :mod:`csv` module's CSV-creation capability acts on file-like objects,
and Django's :class:`~django.http.HttpResponse` objects are file-like objects.

Here's an example::

    import csv
    from django.http import HttpResponse

    def some_view(request):
        # Create the HttpResponse object with the appropriate CSV header.
        response = HttpResponse(content_type='text/csv')
        response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

        writer = csv.writer(response)
        writer.writerow(['First row', 'Foo', 'Bar', 'Baz'])
        writer.writerow(['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"])

        return response

The code and comments should be self-explanatory, but a few things deserve a
mention:

* The response gets a special MIME type, :mimetype:`text/csv`. This tells
  browsers that the document is a CSV file, rather than an HTML file. If
  you leave this off, browsers will probably interpret the output as HTML,
  which will result in ugly, scary gobbledygook in the browser window.

* The response gets an additional ``Content-Disposition`` header, which
  contains the name of the CSV file. This filename is arbitrary; call it
  whatever you want. It'll be used by browsers in the "Save as..." dialog, etc.

* Hooking into the CSV-generation API is easy: Just pass ``response`` as the
  first argument to ``csv.writer``. The ``csv.writer`` function expects a
  file-like object, and :class:`~django.http.HttpResponse` objects fit the
  bill.

* For each row in your CSV file, call ``writer.writerow``, passing it an
  iterable object such as a list or tuple.

* The CSV module takes care of quoting for you, so you don't have to worry
  about escaping strings with quotes or commas in them. Just pass
  ``writerow()`` your raw strings, and it'll do the right thing.
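
The quoting behavior described above can be checked without Django. The
following is a minimal sketch, assuming Python 3, that writes the same two rows
to an in-memory ``io.StringIO`` buffer (a stand-in for the response, chosen
here purely for illustration) and inspects the result:

```python
import csv
import io

# io.StringIO is a file-like object, used here as a stand-in for HttpResponse.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['First row', 'Foo', 'Bar', 'Baz'])
writer.writerow(['Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"])

# Split on the writer's default line terminator to get one string per row.
lines = buffer.getvalue().split('\r\n')
print(lines[0])  # First row,Foo,Bar,Baz
print(lines[1])  # Second row,A,B,C,"""Testing""",Here's a quote
```

With the default dialect, only the field containing double quotes is wrapped in
quotes, and its embedded quotes are doubled. The apostrophe needs no special
handling.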

.. admonition:: Handling Unicode on Python 2

    Python 2's :mod:`csv` module does not support Unicode input. Since Django
    uses Unicode internally, strings read from sources such as
    :class:`~django.http.HttpRequest` are potentially problematic. There are a
    few options for handling this:

    * Manually encode all Unicode objects to a compatible encoding.

    * Use the ``UnicodeWriter`` class provided in the `csv module's examples
      section`_.

    * Use the `python-unicodecsv module`_, which aims to be a drop-in
      replacement for :mod:`csv` that gracefully handles Unicode.

    For more information, see the Python documentation of the :mod:`csv` module.

    .. _`csv module's examples section`: https://docs.python.org/2/library/csv.html#examples
    .. _`python-unicodecsv module`: https://github.com/jdunck/python-unicodecsv

.. _streaming-csv-files:

Streaming large CSV files
~~~~~~~~~~~~~~~~~~~~~~~~~

When dealing with views that generate very large responses, you might want to
consider using Django's :class:`~django.http.StreamingHttpResponse` instead.
For example, by streaming a file that takes a long time to generate, you can
avoid a load balancer dropping a connection that might otherwise have timed out
while the server was generating the response.

In this example, we make full use of Python generators to efficiently handle
the assembly and transmission of a large CSV file::

    import csv

    from django.utils.six.moves import range
    from django.http import StreamingHttpResponse

    class Echo(object):
        """An object that implements just the write method of the file-like
        interface.
        """
        def write(self, value):
            """Write the value by returning it, instead of storing in a buffer."""
            return value

    def some_streaming_csv_view(request):
        """A view that streams a large CSV file."""
        # Generate a sequence of rows. The range is based on the maximum number of
        # rows that can be handled by a single sheet in most spreadsheet
        # applications.
        rows = (["Row {}".format(idx), str(idx)] for idx in range(65536))
        pseudo_buffer = Echo()
        writer = csv.writer(pseudo_buffer)
        response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                         content_type="text/csv")
        response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
        return response
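
The streaming view works because ``writer.writerow()`` returns whatever the
underlying object's ``write()`` method returns, so each call hands back one
formatted line. The sketch below isolates that mechanism from Django. It
assumes Python 3, and ``stream_csv_lines`` and ``numbered_rows`` are
hypothetical names used only for illustration:

```python
import csv
from itertools import islice


class Echo:
    """A pseudo-buffer whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


def stream_csv_lines(rows):
    """Lazily yield one CSV-formatted line per input row (hypothetical helper)."""
    writer = csv.writer(Echo())
    for row in rows:
        # writerow() passes the formatted line to Echo.write() and returns
        # whatever write() returns, which is the line itself.
        yield writer.writerow(row)


def numbered_rows():
    """An endless row source, to show that nothing is built up in memory."""
    idx = 0
    while True:
        yield ["Row {}".format(idx), str(idx)]
        idx += 1


# Only the three requested lines are ever generated.
first_lines = list(islice(stream_csv_lines(numbered_rows()), 3))
print(first_lines)
```

Because the rows are produced and formatted on demand, a consumer such as
:class:`~django.http.StreamingHttpResponse` can send each line as soon as it is
ready.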

Using the template system
=========================

Alternatively, you can use the :doc:`Django template system </topics/templates>`
to generate CSV. This is lower-level than using the convenient Python :mod:`csv`
module, but the solution is presented here for completeness.

The idea here is to pass a list of items to your template, and have the
template output the commas in a :ttag:`for` loop.

Here's an example, which uses the same data as above::

    from django.http import HttpResponse
    from django.template import loader, Context

    def some_view(request):
        # Create the HttpResponse object with the appropriate CSV header.
        response = HttpResponse(content_type='text/csv')
        response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'

        # The data is hard-coded here, but you could load it from a database or
        # some other source.
        csv_data = (
            ('First row', 'Foo', 'Bar', 'Baz'),
            ('Second row', 'A', 'B', 'C', '"Testing"', "Here's a quote"),
        )

        t = loader.get_template('my_template_name.txt')
        c = Context({
            'data': csv_data,
        })
        response.write(t.render(c))
        return response

The only difference between this example and the previous example is that this
one uses template loading instead of the CSV module. The rest of the code --
such as the ``content_type='text/csv'`` -- is the same.

Then, create the template ``my_template_name.txt``, with this template code:

.. code-block:: html+django

    {% for row in data %}"{{ row.0|addslashes }}", "{{ row.1|addslashes }}", "{{ row.2|addslashes }}", "{{ row.3|addslashes }}", "{{ row.4|addslashes }}"
    {% endfor %}

This template is quite basic. It iterates over the given data and outputs one
line of CSV per row, always rendering the first five items of each row
(``row.0`` through ``row.4``). It uses the :tfilter:`addslashes` template
filter to backslash-escape quotes in each value.

Other text-based formats
========================

Notice that very little here is specific to CSV -- just the output format. You
can use either of these techniques to output any text-based format you can
dream of. You can also use a similar technique to generate arbitrary binary
data; see :doc:`/howto/outputting-pdf` for an example.
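
As one illustration of another text-based format, the :mod:`csv` module can
emit tab-separated values when given a different delimiter. This is a minimal
sketch, assuming Python 3 and using an in-memory buffer in place of a response.
The row contents are made up for the example:

```python
import csv
import io

# An in-memory buffer stands in for the response object.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t')
writer.writerow(['First row', 'Foo', 'Bar', 'Baz'])
# A value containing the delimiter is quoted automatically.
writer.writerow(['Second row', 'contains\ttab', 'B', 'C'])

tsv_text = buffer.getvalue()
print(repr(tsv_text))
```

In a view, the same writer could presumably be pointed at an
:class:`~django.http.HttpResponse` created with a matching content type such as
``text/tab-separated-values``.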