Discussion:
__unicode__ and __str__
Malthe Borch
2008-10-26 00:12:33 UTC
hey list,

currently the way we insert dynamic values as text into the template
write stream is to first check if it's a string at all (it could be an
integer number), and if not, we convert it using ``str``.

we perform this check using ``isinstance`` (first for ``unicode``,
then ``str``); perhaps we should consider checking for ``__unicode__``
and ``__str__``; or, as a rule, always call them if present.

I need to prototype this idea, in particular for speed, but I think it
could open up some interesting uses; and ``isinstance`` is rather
lame, anyway.
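
A minimal sketch of the check described above, not the actual z3c.pt code: ``insert_text`` and the ``write`` callable are hypothetical names standing in for the template write stream. The ``text_type`` shim is an assumption added so the sketch runs on Python 3 as well as the Python 2 discussed here.

```python
try:
    text_type = unicode  # Python 2, the version this thread is about
except NameError:
    text_type = str      # Python 3: str is already Unicode text


def insert_text(write, value):
    """Write value to the stream, converting non-strings with str().

    `write` stands in for the template's write-stream callable;
    the name and signature are assumptions for this sketch.
    """
    if isinstance(value, text_type):
        write(value)          # Unicode text is written as-is
    elif isinstance(value, str):
        write(value)          # plain strings are written as-is, too
    else:
        write(str(value))     # anything else (e.g. an int) is converted


chunks = []
insert_text(chunks.append, u'total: ')
insert_text(chunks.append, 42)
print(u''.join(chunks))
```

Note that an object defining only ``__unicode__`` gets no special treatment here; it falls through to ``str``.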

\malthe

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "z3c.pt" group.
To post to this group, send email to ***@googlegroups.com
To unsubscribe from this group, send email to z3c_pt+***@googlegroups.com
For more options, visit this group at http://groups.google.com/group/z3c_pt?hl=en
-~----------~--
Chris McDonough
2008-10-26 06:37:11 UTC
+1
Post by Malthe Borch
hey list,
currently the way we insert dynamic values as text into the template
write stream is to first check if it's a string at all (it could be an
integer number), and if not, we convert it using ``str``.
we perform this check using ``isinstance`` (first for ``unicode``,
then ``str``); perhaps we should consider checking for ``__unicode__``
and ``__str__``; or, as a rule, always call them if present.
I need to prototype this idea, in particular for speed, but I think it
could open up for some interesting usage; and ``isinstance`` is rather
lame, anyway.
\malthe
Hanno Schlichting
2008-10-26 13:44:47 UTC
Post by Malthe Borch
currently the way we insert dynamic values as text into the template
write stream is to first check if it's a string at all (it could be an
integer number), and if not, we convert it using ``str``.
we perform this check using ``isinstance`` (first for ``unicode``,
then ``str``); perhaps we should consider checking for ``__unicode__``
and ``__str__``; or, as a rule, always call them if present.
Didn't we decide to drop support for non-unicode string insertion
already? I was under the impression that we shouldn't deal with encoding
conversions at all inside the machinery. zope.tal and friends have a
clear Unicode-only policy as well.

In that case a simple (pseudo code):

    text = 'a'
    if isinstance(text, unicode):
        _write(text)
    else:
        _write(unicode(text))

should be enough, shouldn't it? Feel free to experiment with using
__unicode__ instead, but I have a feeling that checking for the
attribute existence, checking if it is a callable and calling it, is
more expensive.

Hanno


Malthe Borch
2008-10-26 14:32:57 UTC
Post by Hanno Schlichting
Didn't we decide to drop support for non-unicode string insertion
already? I was under the impression that we shouldn't deal with encoding
conversions at all inside the machinery. zope.tal and friends have a
clear Unicode-only policy as well.
There is the odd exception of int; it really does make sense to
support it natively (in my view).
Post by Hanno Schlichting
should be enough, shouldn't it? Feel free to experiment with using
__unicode__ instead, but I have a feeling that checking for the
attribute existence, checking if it is a callable and calling it, is
more expensive.
Here's my use case:

I have this "error" object, which has a ``messages`` attribute
(containing zero or more errors), but I'd also like it to be useful by
itself; so I'd give the error object a ``__unicode__`` method and it
would "just work" (perhaps concatenate the errors or something).

Let me test the speed implications; certainly if it has any serious
impact, we can't justify it. Perhaps something like this:

1. isinstance unicode
2. isinstance str (we currently do support this, but maybe shouldn't)
3. getattr __unicode__
4. coerce to str/unicode

This approach should have no performance penalties for the common
case, but it would support my case, too.
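
The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the z3c.pt implementation: ``to_text`` and the ``Error`` class are hypothetical names, ``bytes`` is used for the byte-string check (an alias of ``str`` on Python 2.6+), and the ``text_type`` shim only exists so the sketch also runs on Python 3.

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3


def to_text(value):
    # 1. Unicode text passes through untouched (the common case).
    if isinstance(value, text_type):
        return value
    # 2. Byte strings pass through as well (currently supported).
    if isinstance(value, bytes):
        return value
    # 3. Objects may opt in by defining a __unicode__ method.
    hook = getattr(value, '__unicode__', None)
    if hook is not None:
        return hook()
    # 4. Everything else (e.g. an int) is coerced.
    return text_type(value)


class Error(object):
    """Hypothetical error object holding zero or more messages."""

    def __init__(self, messages):
        self.messages = list(messages)

    def __unicode__(self):
        # Concatenate the individual messages for direct insertion.
        return u', '.join(self.messages)


print(to_text(Error([u'name is required', u'age must be a number'])))
print(to_text(7))
```

Because the ``getattr`` lookup only happens after both ``isinstance`` checks fail, values that are already strings never pay for it.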

\malthe

Hanno Schlichting
2008-10-26 14:41:38 UTC
Post by Malthe Borch
Post by Hanno Schlichting
Didn't we decide to drop support for non-unicode string insertion
already? I was under the impression that we shouldn't deal with encoding
conversions at all inside the machinery. zope.tal and friends have a
clear Unicode-only policy as well.
There is the odd exception of int; it really does make sense to
support it natively (in my view).
Of course.
Post by Malthe Borch
Post by Hanno Schlichting
should be enough, shouldn't it? Feel free to experiment with using
__unicode__ instead, but I have a feeling that checking for the
attribute existence, checking if it is a callable and calling it, is
more expensive.
1. isinstance unicode
2. isinstance str (we currently do support this, but maybe shouldn't)
3. getattr __unicode__
4. coerce to str/unicode
Hhm, what happens if we always do:

    _write(unicode(text))

    >>> unicode(1)
    u'1'
    >>> unicode(u'foo')
    u'foo'
    >>> class A(object):
    ...     def __unicode__(self):
    ...         return u'unicode'
    ...     def __str__(self):
    ...         return 'string'
    ...
    >>> a = A()
    >>> a
    <__main__.A object at 0x52230>
    >>> unicode(a)
    u'unicode'

That should have next to no performance impact and should work in all
cases (except byte-strings).
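
The byte-string exception can be illustrated as follows. In Python 2, ``unicode(value)`` on a byte string decodes it with the default encoding, which is normally ASCII, so any non-ASCII byte raises ``UnicodeDecodeError``. The sketch spells out that decode step explicitly so that it also runs on Python 3; ``decodes_as_ascii`` is a hypothetical helper, not part of z3c.pt.

```python
def decodes_as_ascii(data):
    """Return True if the byte string survives an ASCII decode."""
    try:
        data.decode('ascii')  # the step Python 2's unicode(data) does implicitly
    except UnicodeDecodeError:
        return False
    return True


plain = u'foo'.encode('utf-8')         # pure ASCII bytes
accented = u'caf\xe9'.encode('utf-8')  # contains non-ASCII bytes

print(decodes_as_ascii(plain))
print(decodes_as_ascii(accented))
```

Byte strings of the second kind are the ones that would break if every value were passed through ``unicode`` unconditionally.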

Hanno

Malthe Borch
2008-10-26 14:51:59 UTC
Post by Hanno Schlichting
That should have next to no performance impact and should work in all
cases (except byte-strings).
Yes, this should work for the unicode-only version; in theory, the
engine supports an encoded-mode, where it deals with strings. I'd have
to look closer for this one. Let's prototype the idea and see what we
come up with.

\malthe

Malthe Borch
2008-10-27 11:22:22 UTC
Post by Hanno Schlichting
    text = 'a'
    if isinstance(text, unicode):
        _write(text)
    else:
        _write(unicode(text))
should be enough, shouldn't it? Feel free to experiment with using
__unicode__ instead, but I have a feeling that checking for the
attribute existence, checking if it is a callable and calling it, is
more expensive.
Unfortunately, converting to ``unicode`` instead of ``str`` is very
expensive. This will probably change to some extent in Python 3.0.

Checking for a ``__unicode__`` attribute helps some, but it's still
about a 15-20% slowdown on the admittedly not very realistic bigtable
benchmark.
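
A comparison of this kind can be approximated with ``timeit``. The sketch below is not the bigtable benchmark and will not reproduce the figure above; it only shows one way to measure the extra attribute check in isolation. The function names and the choice of 100 integers as input are assumptions.

```python
import timeit

try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3


def convert_isinstance(value):
    """Baseline: isinstance check, then str()."""
    if isinstance(value, text_type):
        return value
    return str(value)


def convert_with_hook(value):
    """Variant: additionally look for a __unicode__ method."""
    if isinstance(value, text_type):
        return value
    hook = getattr(value, '__unicode__', None)
    if hook is not None:
        return hook()
    return str(value)


def bench(func, values, number=1000):
    """Return the seconds taken to convert every value `number` times."""
    return timeit.timeit(lambda: [func(v) for v in values], number=number)


values = list(range(100))  # ints only, loosely mimicking a table of numbers
baseline = bench(convert_isinstance, values)
hooked = bench(convert_with_hook, values)
print('isinstance only: %.4fs' % baseline)
print('with hook check: %.4fs' % hooked)
```

Timings vary by interpreter and machine, so the ratio between the two numbers matters more than either value.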

I think we need to depart from this idea.

I did refactor the code generation logic to make it easier to test
these things quickly; it's now located in the ``generation`` module.

\malthe
