Arthur de Jong

Open Source / Free Software developer

summaryrefslogtreecommitdiffstats
path: root/docs/ref/files/uploads.txt
blob: 3304363b906252a6670cae50789eba0e09b598a4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
==================================
Uploaded Files and Upload Handlers
==================================

.. module:: django.core.files.uploadedfile
   :synopsis: Classes representing uploaded files.

Uploaded files
==============

.. class:: UploadedFile

During file uploads, the actual file data is stored in :attr:`request.FILES
<django.http.HttpRequest.FILES>`. Each entry in this dictionary is an
``UploadedFile`` object (or a subclass) -- a simple wrapper around an uploaded
file. You'll usually use one of these methods to access the uploaded content:

.. method:: UploadedFile.read()

    Read the entire uploaded data from the file. Be careful with this method:
    if the uploaded file is huge it can overwhelm your system if you try to
    read it into memory. You'll probably want to use ``chunks()`` instead; see
    below.

.. method:: UploadedFile.multiple_chunks(chunk_size=None)

    Returns ``True`` if the uploaded file is big enough to require reading in
    multiple chunks. By default this will be any file larger than 2.5 megabytes,
    but that's configurable; see below.

.. method:: UploadedFile.chunks(chunk_size=None)

    A generator returning chunks of the file. If ``multiple_chunks()`` is
    ``True``, you should use this method in a loop instead of ``read()``.

    In practice, it's often easiest simply to use ``chunks()`` all the time.
    Looping over ``chunks()`` instead of using ``read()`` ensures that large
    files don't overwhelm your system's memory.

Here are some useful attributes of ``UploadedFile``:

.. attribute:: UploadedFile.name

    The name of the uploaded file (e.g. ``my_file.txt``).

.. attribute:: UploadedFile.size

    The size, in bytes, of the uploaded file.

.. attribute:: UploadedFile.content_type

    The content-type header uploaded with the file (e.g. :mimetype:`text/plain`
    or :mimetype:`application/pdf`). Like any data supplied by the user, you
    shouldn't trust that the uploaded file is actually this type. You'll still
    need to validate that the file contains the content that the content-type
    header claims -- "trust but verify."

.. attribute:: UploadedFile.content_type_extra

    A dictionary containing extra parameters passed to the ``content-type``
    header. This is typically provided by services, such as Google App Engine,
    that intercept and handle file uploads on your behalf. As a result your
    handler may not receive the uploaded file content, but instead a URL or
    other pointer to the file. (see `RFC 2388`_ section 5.3).

    .. _RFC 2388: http://www.ietf.org/rfc/rfc2388.txt

.. attribute:: UploadedFile.charset

    For :mimetype:`text/*` content-types, the character set (i.e. ``utf8``)
    supplied by the browser. Again, "trust but verify" is the best policy here.

.. note::

    Like regular Python files, you can read the file line-by-line simply by
    iterating over the uploaded file:

    .. code-block:: python

        for line in uploadedfile:
            do_something_with(line)

    Lines are split using `universal newlines`_. The following are recognized
    as ending a line: the Unix end-of-line convention ``'\n'``, the Windows
    convention ``'\r\n'``, and the old Macintosh convention ``'\r'``.

    .. _universal newlines: https://www.python.org/dev/peps/pep-0278

Subclasses of ``UploadedFile`` include:

.. class:: TemporaryUploadedFile

    A file uploaded to a temporary location (i.e. stream-to-disk). This class
    is used by the
    :class:`~django.core.files.uploadhandler.TemporaryFileUploadHandler`. In
    addition to the methods from :class:`UploadedFile`, it has one additional
    method:

.. method:: TemporaryUploadedFile.temporary_file_path()

    Returns the full path to the temporary uploaded file.

.. class:: InMemoryUploadedFile

    A file uploaded into memory (i.e. stream-to-memory). This class is used
    by the :class:`~django.core.files.uploadhandler.MemoryFileUploadHandler`.

Built-in upload handlers
========================

.. module:: django.core.files.uploadhandler
   :synopsis: Django's handlers for file uploads.

Together the :class:`MemoryFileUploadHandler` and
:class:`TemporaryFileUploadHandler` provide Django's default file upload
behavior of reading small files into memory and large ones onto disk. They
are located in ``django.core.files.uploadhandler``.

.. class:: MemoryFileUploadHandler

File upload handler to stream uploads into memory (used for small files).

.. class:: TemporaryFileUploadHandler

Upload handler that streams data into a temporary file using
:class:`~django.core.files.uploadedfile.TemporaryUploadedFile`.

.. _custom_upload_handlers:

Writing custom upload handlers
==============================

.. class:: FileUploadHandler

All file upload handlers should be subclasses of
``django.core.files.uploadhandler.FileUploadHandler``. You can define upload
handlers wherever you wish.

Required methods
~~~~~~~~~~~~~~~~

Custom file upload handlers **must** define the following methods:

.. method:: FileUploadHandler.receive_data_chunk(raw_data, start)

    Receives a "chunk" of data from the file upload.

    ``raw_data`` is a byte string containing the uploaded data.

    ``start`` is the position in the file where this ``raw_data`` chunk
    begins.

    The data you return will get fed into the subsequent upload handlers'
    ``receive_data_chunk`` methods. In this way, one handler can be a
    "filter" for other handlers.

    Return ``None`` from ``receive_data_chunk`` to short-circuit remaining
    upload handlers from getting this chunk. This is useful if you're
    storing the uploaded data yourself and don't want future handlers to
    store a copy of the data.

    If you raise a ``StopUpload`` or a ``SkipFile`` exception, the upload
    will abort or the file will be completely skipped.

.. method:: FileUploadHandler.file_complete(file_size)

    Called when a file has finished uploading.

    The handler should return an ``UploadedFile`` object that will be stored
    in ``request.FILES``. Handlers may also return ``None`` to indicate that
    the ``UploadedFile`` object should come from subsequent upload handlers.

Optional methods
~~~~~~~~~~~~~~~~

Custom upload handlers may also define any of the following optional methods or
attributes:

.. attribute:: FileUploadHandler.chunk_size

    Size, in bytes, of the "chunks" Django should store into memory and feed
    into the handler. That is, this attribute controls the size of chunks
    fed into ``FileUploadHandler.receive_data_chunk``.

    For maximum performance the chunk sizes should be divisible by ``4`` and
    should not exceed 2 GB (2\ :sup:`31` bytes) in size. When there are
    multiple chunk sizes provided by multiple handlers, Django will use the
    smallest chunk size defined by any handler.

    The default is 64*2\ :sup:`10` bytes, or 64 KB.

.. method:: FileUploadHandler.new_file(field_name, file_name, content_type, content_length, charset, content_type_extra)

    Callback signaling that a new file upload is starting. This is called
    before any data has been fed to any upload handlers.

    ``field_name`` is a string name of the file ``<input>`` field.

    ``file_name`` is the unicode filename that was provided by the browser.

    ``content_type`` is the MIME type provided by the browser -- E.g.
    ``'image/jpeg'``.

    ``content_length`` is the length of the image given by the browser.
    Sometimes this won't be provided and will be ``None``.

    ``charset`` is the character set (i.e. ``utf8``) given by the browser.
    Like ``content_length``, this sometimes won't be provided.

    ``content_type_extra`` is extra information about the file from the
    ``content-type`` header. See :attr:`UploadedFile.content_type_extra
    <django.core.files.uploadedfile.UploadedFile.content_type_extra>`.

    This method may raise a ``StopFutureHandlers`` exception to prevent
    future handlers from handling this file.

.. method:: FileUploadHandler.upload_complete()

    Callback signaling that the entire upload (all files) has completed.

.. method:: FileUploadHandler.handle_raw_input(input_data, META, content_length, boundary, encoding)

    Allows the handler to completely override the parsing of the raw
    HTTP input.

    ``input_data`` is a file-like object that supports ``read()``-ing.

    ``META`` is the same object as ``request.META``.

    ``content_length`` is the length of the data in ``input_data``. Don't
    read more than ``content_length`` bytes from ``input_data``.

    ``boundary`` is the MIME boundary for this request.

    ``encoding`` is the encoding of the request.

    Return ``None`` if you want upload handling to continue, or a tuple of
    ``(POST, FILES)`` if you want to return the new data structures suitable
    for the request directly.