Salesforce.com API Gotchas Part 2: Custom Field Ids

This is the second part in a series (see Part 1, Part 3, and Part 4).


Salesforce.com LogoOne of the many excellent Salesforce.com platform features is allowing one to define typed custom fields on objects. This is a widely used feature of Salesforce.com and it is not uncommon for an organization to define hundreds of custom fields. Using the Salesforce.com SOAP API, one can retrieve the defined custom fields for an object and can sync custom field data using the same procedure as syncing standard field data. Normally this works beautifully. Unfortunately, problems result when custom field names are changed.

Unique Ids for Object Records in Salesforce.com

Subset of Salesforce.com ID Prefix to Object Mapping

ID Prefix Entitiy Type
001 Account
006 Opportunity
003 Contact
00Q Lead
701 Campaign
00V Campaign Member
00D Organization
005 User
00E Profile
00N Custom Field Definition

The problem stems from an inconsistency in the way Salesforce.com treats custom fields compared to other records. To understand the inconsistency, we must first understand how Salesforce.com uniquely identifies all records. In Salesforce.com, every record has a 15-character Id (the 18-character version will be covered in a future post). The first three characters of the Id identify the object type of the record. As examples, a 003 prefix means the record is a Contact, while a 005 prefix means the record is a User, and a 00D prefix means the record is an Organization.

Custom fields, however, are treated differently. While custom fields have 15 character Ids (prefixed by 00N) in Salesforce.com’s database just like object records, this Id is not exposed via the SOAP API. Instead, Salesforce.com provides a unique, but mutable, “API name” that is initially auto-generated from the field name.

Salesforce.com Records and Metadata

The “API name” is one of the many field attributes retrievable using the describeSObjects() API call. Results are returned for all object types in the form of a list of describeSObjectResult objects. (Incidentally, you can get the record Id prefix for an object type by examining the keyPrefix field in a describeSObjectResult)

describeSObjectResult Object

There is a plethora of information in describeSObjectResult, but for now the data we care about is the list of Field objects. Unsurprisingly, there will be one Field object for each field (standard or custom) that is defined for an object type.

Label and Name for 'Annual Revenue' lead standard field

Label and Name for 'Annual Revenue' lead standard field

Field Object

A Field has both a “label” (a user-friendly display name) and a “name” (computer-friendly identifier). As an example:

    For contact standard field "Annual Revenue":
        label = Annual Revenue
        name = AnnualRevenue

Note that the name is usually just the label with spaces and other non-word characters stripped out.

"Shoe Size" custom field definition in Salesforce.com UI

'Shoe Size' custom field definition in Salesforce.com UI

Custom Fields and the API

Custom fields are represented slightly differently. The name is not actually the “Field Name” shown in the Salesforce.com UI (not exposed via the API), but is instead the aforementioned “API Name”. The “API Name” is closely related to the “Field Name” as it is the field name with a special __c suffix (two underscores before the c).

For instance, if you have a custom field to hold a contact’s shoe size, you might call the field “Shoe Size”. Upon field creation, Salesforce.com will set the “Field Name” to Shoe_Size and the “API Name” to Shoe_Size__c. If you then “delete” the field, the name will change to Shoe_Size_del__c. The field will then be placed in your deleted fields list where you can “undelete” the field. Upon undeletion, the name remains Shoe_Size_del__c. You can then delete the field again, which will change its name to Shoe_Size_del_del__c.

Genius.com’s integration with Salesforce Metadata

When a customer integrates Salesforce.com with Genius.com, we download all available records for that customer from several different object types (Contact, Lead, etc). As part of this, it is important that we represent all of the custom fields on those objects. To keep our data consistent with Salesforce.com’s, we need to track all changes that the customer makes to their custom fields. To be specific, I’m referring to changes to the fields themselves, not just changes to specific record field values. Let’s work through some scenarios of how the custom fields for an object can change. (For clarity, let’s just focus on the Contact object.) You may note that some of the behavior we’re giving our hypothetical customer is not necessarily common usage, but it’s all perfectly valid usage of the Salesforce.com custom field model.

First, suppose that Contact has the Shoe_Size__c integer custom field I used earlier. If the customer adds a date custom field LastBoughtNewShoes__c, that change is easy to handle. When we get the fields of the Contact object, we will see Shoe_Size__c and LastBoughtNewShoes__c. We can detect that we already have a field called Shoe_Size__c, so we skip that one. We perform the same check for LastBoughtNewShoes__c and see that we do not yet have that field, so we create that field on Contact.

Problems with renaming a custom field

Now, suppose that the customer decides he doesn’t like the name LastBoughtNewShoes__c and changes it to LastPurchasedNewShoes__c. When we next check the Contact metadata, it will look like LastBoughtNewShoes__c was deleted and LastPurchasedNewShoes__c was added, thus causing us to have to delete all of the values for LastBoughtNewShoes__c and re-download them for LastPurchasedNewShoes__c, even though it’s exactly the same data.

Problems with deleting and undeleting a custom field

The customer then accidentally deletes the shoe size custom field. Realizing his mistake, he then undeletes the custom field. His data is still safe, but the field name will have changed from ShoeSize__c to ShoeSize_del__c. Thus, like when he purposely changed the name of the field, we will delete the old field (and all its values) and create the new field (and have to re-download all the values).

Problems with custom field names and data types

Next, the customer decides he doesn’t like storing the date that his contacts last bought shoes and would instead like to store an estimation of how frequently his contacts buy shoes. So, he deletes LastPurchasedNewShoes__c, which changes its name to LastPurchasedNewShoes_del__c, and creates a new LastPurchasedNewShoes__c. This time, though, instead of creating it as a date field, he creates it as a picklist (aka dropdown) field with values like “1st Quarter”, “2nd Quarter”, “3rd Quarter”, and “4th Quarter”. When we check the metadata for Contact, we will see a LastPurchasedNewShoes__c field that we already have. However, it’s obviously not going to work to try to write “1st Quarter” into a date field. Depending on the language and DBMS used, that might cause an exception to be thrown, a date of 1970-01-01 or 0000-00-00 to be written, or any of a number of other possible failure conditions. It is possible to detect this specific scenario, though, by examining the type of the field. We could see that the type of LastPurchasedNewShoes__c had changed since the last time we checked metadata and use that to delete the old date field and create a new picklist field. There are a few other fields that can provide hints that a field may have changed in a way that requires deleting the old field and recreating it like “calculated”, “nillable” and a few others, but this is not always accurate, as the next example shows.

Problems with deleting a custom field and creating a new field

Next, the customer decides he doesn’t like storing shoe size as the standard US shoe size (11, 12, etc) and would prefer to store the interior volume of the optimal shoe for the contact in cubic centimeters. To simplify matters, assume that the customer did not accidentally delete the shoe size field, so it is still called Shoe_Size__c. This new measurement cannot be calculated from the shoe size, so all new data is going to need to be gathered from contacts for this field. Consequently, the customer deletes the initial Shoe_Size__c field and replaces it with another int field, also called Shoe_Size__c. When we next check metadata, everything will match: the old field and the new field have the same name, type, and other attributes, even though they are semantically different. The correct behavior would be to delete the old field and all its values and create a new field of the same name, but there is no way to figure out what has actually happened from the data available through the API.

The naive solution

On the surface, the problems listed above are minor inconveniences that can be overcome by being conservative when custom field changes are detected. In this conservative approach, we treat all changes as the previously synced field being deleted and a new field being created. This results in all data for the previous field being deleted and data for the new field being inserted. It may seem like this just causes data syncs to take a little longer. However, there are much more important (and unavoidable) effects.

Why the problems can’t be solved with the current API

Genius.com depends on custom fields for more than just data consistency. There are several important parts of our application that depend on custom fields. Our SmartGroups implementation and our marketing automation product allow users to dynamically segment their prospect base based on custom field values.

Back to renamed field example: let us imagine a marketing user wants to automatically send an email to a prospect when it has been 6 months since they last purchased a shoe. The marketer creates an automation workflow with a rule that says If LastBoughtNewShoes__c is greater than six months send "Email 1" and then goes back to browsing awkward family photos. Imagine that a week later the administrator for the marketer’s Salesforce.com organization renames the field in Salesforce. Under the naive solution we delete the custom field and all data. This requires us to invalidate the workflow, since the custom field it depends on no longer exists. Ideally, we would be able to just rename the field and thus leave the workflow untouched. Unfortunately, without an immutable primary key (i.e. the 00N prefixed Id) this is not possible.

The fundamental problem arises because the field name, while technically a primary key, is not a good primary key because it can change. Having a mutable primary key isn’t bad in theory. It can work in a self-contained system where all data can be kept consistent, but it is quite inconvenient in practice. There are a handful of reasons for this (making backed up data harder to correlate with current data, inefficiencies in propagating primary key changes, etc), but the most important one, in this situation, is that Salesforce.com isn’t a self-contained system. As an external system, we have no way of being notified of changes to primary keys.

It’s not just a problem for Genius.com

Interestingly enough, we aren’t the only ones to have noticed this difficulty with custom fields. Salesforce.com also provides functionality to create a web-to-lead form that you can place on your website and which will automatically create a lead in your Salesforce.com organization when someone fills it out. It is useful to include custom field form fields in web-to-lead forms and, naturally, Salesforce.com lets you do this. Here’s some example output of the Salesforce.com web-to-lead form generator.


<input type="text" name="first_name" id="first_name">


<input type="hidden" name="00N600000012345" id="00N600000012345"
value="Collateral (Whitepaper/Booklet)" /><!-- Lead Type -->


<input type="hidden" name="00N6000000ABCDE" id="00N6000000ABCDE"
value="Whitepaper: Sales 2.0 - Faster Sales in a Slower Economy" ><!-- Event Name -->


<input type="hidden" name="00N6000000A1B2C" id="00N6000000A1B2C"
value="" /><!--Search Terms-->


<input type="hidden" name="00N6000000ZXWQD" id="00N6000000ZXWQD"
value="" /><!--Search Engine-->


<input type="hidden" name="00N600000000001" id="00N600000000001"
value="" /><!--Media Source-->

The Salesforce.com application also uses the 00N Id when working with custom fields. For instance, an example of the URL for editing a custom field is: https://na1.salesforce.com/00N30000001N2f9/e.

As mentioned before, 00N is the Id prefix for custom fields. Given this example, it’s reasonable to assume that the developers who created the web-to-lead form generator recognized the inadequacy of relying on custom field names and instead used the “real” primary key for identifying custom fields. This Id behaves like other Salesforce.com Ids: it is unique and never changes, thus making it unique against all future possible records as well. This is the Id we really need, but there is no way to get this Id through the API.

Continue with Part 3.

  • Digg
  • StumbleUpon
  • del.icio.us
  • Facebook
  • Twitter
  • Google Bookmarks
  • DZone
  • HackerNews
  • LinkedIn
  • Reddit
  • http://developer.force.com Dave Carroll

    Another great post Marshall!

    All the things you mention are definitely items that the salesforce.com API developer needs to be aware of. Adherence to best practices with regard to custom field management would go a long way here.

    Renaming custom fields is fraught with other dangers as well. Of course, we don’t want to prevent anyone from being able to rename a field, especially while they are still developing the application, there is little we can do except put up a warning in the UI when a field is being changed – “Be careful when changing the name as it may affect existing integrations.”

    The best practice around this area is to change the label if you don’t like the field name when you are in production.

    Deleting and undeleting custom fields has even more implications as indicated by the result page that you see after undeleting a custom field –

    “You have successfully undeleted the custom field: Reseller Phone. All the data in the field has been restored as well.

    Changes were made to the field when it was deleted. You should manually reverse these changes in order to return your field to its pre-deletion state.

    * The developer name was changed to Reseller_Phone_del.
    * The field was removed from any layouts that were edited while the field was deleted.
    * If the field was required, it has been changed to not be required.
    * The field was removed from any AppExchange packages that contained it directly.

    It may take several hours before you are able to search for values in the field.”

    Your layouts are not restored, requiredness is lost as well as preserving the “_del” name change you mention.

    I like your example for custom field names and data types as it illustrates a common design error. The date field is ultimately the more useful of date versus picklist. Since the quarter can be calculated, the better design is to leave the date field as your input field and add a formula field to generate the quarter. I understand that this is an illustrative sample, but it is useful for thinking about data and what is stored versus displayed and the dangers of not understanding the implications outlined in your article when making schema changes.

    It’s not just a problem for Genius.com or for salesforce.com. Integrations are reasonably built against a known schema and when that schema changes, whether it’s salesforce.com or MySQL, the integration is going to fail.

    I’m not sure that I understand how having a unique ID (00N) is useful when a customer changes a field type though. Although the ID is unique and immutable, the rest of the meta-data is not, so that the assumption of data type is not helped by the ID.

    Historical side not:
    The first version of the API that was aware of custom fields and custom objects only allowed you to use the ID. The describe call included a label and a custom field ID. This was problematic because an XML tag is not properly formed if it starts with a number. You can also see this when you create a relationship. The relationship name can default to an ID unless you explicitly name it.

    Cheers and keep up the good work.

  • http://www.genius.com Ryan Ausanka-Crues

    Dave-
    Thank you for your insight into Salesforce.com best practices and the evolution of the Salesforce.com Partner API. While it’s useful to be able to rename custom fields, it’s important for those who administer Salesforce.com organizations to understand the implications such an action will have on integrations.

    Unfortunately, as a Salesforce.com partner using the Salesforce.com Partner API to provide a seamless integration to any organization regardless of schema, we have absolutely no control over what our customers do in their Salesforce.com organization. We are not able to exert any influence on organization administrators towards adherence to custom field management best practices. Thus, it would be immensely helpful to be able to track a field using its 00N ID.

    You are correct that having the unique ID (00N) when a field type changes will not change how such an operation is handled. It will, however, give us better insight into what is actually happening by allowing us to distinguish between a field type being changed and an existing field being deleted with a new field being created.

    Thanks again for your insight.

  • http://www.accessintel.com Rob

    Great article. As a new Salesforce.com user, what are your thoughts about the speed of the API calls? I’ve been hearing some horror stories lately – that during peak times, API calls take very long. I’m having a hard time believing Salesforce would let that happen.

  • Marshall Pierce

    Rob,
    Though the performance of the API does degrade slightly during peak hours, it’s never been anything close to a “horror story.” Usually the dip isn’t even noticeable. Like most external systems (which by definition have non-trivial latency), the key to achieving good performance is parallelization.

  • http://www.accessintel.com Rob

    Marshall, thanks for the reply! That’s definitely reassuring.

  • Candyce

    Thanks for this info. We have run into a similar problem with an AppExchange product, and your post helps me understand what might be creating the issue. We use Marketo integrated with SFDC, and we’ve found that when we change the field type of a custom field, that change does not populate to Marketo. For example, we had a field that was set as a checkbox (Marketo reads this data as true or false). We realized later that we needed a picklist in order to make the field required on our landing page registration forms in Marketo. So we naively made the change. But Marketo never picked up that change. The only options it would allow for the field were “true” or “false” even though we’d created 3 options in the picklist. Even the override functionality in Marketo that enables different labels for selections would not work. We had to delete the field in SFDC and create a new field with a different name and set it to picklist in order to be able to use the field in a Marketo form.

    I’ve also noticed that if we delete a field in SFDC, it does not get deleted in Marketo. A Marketo administrator can hide the field, but it’s not completely deleted. The hide function has to be executed on each field individually.

    • http://eng.genius.com Marshall Pierce

      Update: Tom Davidson from Boomi informed me that Marketo does not use Boomi for SFDC integration. They apparently use Boomi for other integrations, but not for SFDC. The original post follows:

      First, let me emphasize that I don’t have any details about how Marketo’s integration with SFDC works aside from the fact that they use Boomi, so any guesses I have about Marketo’s implementation are just that: guesses.

      My hunch is that the problems you’re having with Marketo actually stem from a different source. The reasons listed in this article make it harder than it should be to track changes to your SFDC schema through SFDC’s API, but it is certainly not impossible (our product has done it for 4 years now, including all of the cases you listed). Instead, I think that the problem may stem from Marketo’s use of Boomi as an intermediate layer between Marketo’s software and SFDC. It could be that the data that Boomi exposes isn’t sufficient to detect the field changes that you made. My understanding is that Boomi requires manual mappings to be configured linking external CRM fields to Marketo fields. This works great most of the time but has severe limitations when changes are made in the CRM. Boomi is a great product, but its focus on providing generic connections across systems limits its ability to provide deep integration with specific CRMs.

      At Genius we chose to integrate with SFDC’s API directly rather than using a product like Boomi to act as an intermediary. This provides several advantages, not the least of which is the ability to quickly detect and apply schema changes like the ones you made. Our system automatically handles such changes within minutes.

      Now that we know that Marketo does not use Boomi for SFDC integration, I’m out of ideas as to why they would be unable to keep track of your schema changes. We’ve been able to do this from the very beginning.

  • Tom Davidson

    Marketo does not use Boomi for its integration with SFDC. Please amend your inaccurate post.
    TD

    • http://eng.genius.com Marshall Pierce

      Tom, thanks for the correction. I’d heard reports that Marketo did use Boomi, and some quick searching for ‘marketo boomi salesforce’ returned hits that seemed to validate that assumption, but you know what they say about assumptions…

      Also, it’s interesting to note that while your email is @boomi.com, your IP address is owned by Marketo.

  • Corporate Sigh

    On a scale of 1 to 10 of how bad an API can be, SF’s custom field API is…

    …wait for it…

    F*cking dumb!

    I mean really, what on earth were they thinking? This is some basic level stuff they are screwing up on.

    • Custom fields are useless.

      Agreed.

      And how come the unique ID starts with a number? When is this ever semantic? In what language?

      This makes Salesforce custom fields useless.

  • Pingback: A new Java Salesforce API Library | Team Lazer Beez Blog

  • http://www.youtube.com/MayTheSForceBWithYou Nathan Pepper

    You have just made my day!!!

    This is the exact name for custom fields that report URL parameters use to dynamically generate reports!

    For example:

    /00O?c=ID&c=LN&c=00N600000012345&rt=2&go=Run+Report

    The thing I still can’t figure out is how I’m going to get all the other parameters and their possible values documented so I can fully utilize this functionality. Any suggestions or takers?

    Cheers,

    Nathan

    • Marshall Pierce

      Nathan,
      Can you be more specific about what you’re trying to do? Which other parameters are you referring to?

  • Pingback: Salesforce.com SOAP API Gotchas Part 1 | Team Lazer Beez Blog

  • Pingback: Salesforce.com SOAP API Gotchas Part 3 | Team Lazer Beez Blog