Speaker 1: 00:00:01 On this episode of Edge of the Web.
Dawn Anderson: 00:00:03 We should optimize for the web of documents, because the web of documents has a huge impact on natural language disambiguation and meeting informational needs, but it’s messy. The web of data is semantics, schemas, OWL, which is the web ontology language, RDF triples, and those kinds of things, and a lot of voice search results and single answers come from data rather than words.
Speaker 1: 00:00:33 Your weekly digital marketing trends with industry trend-setting guests. You’re listening and watching Edge of the Web, winners of best podcast from the Content Marketing Institute for 2017. Hear and see more at Edgeofthewebradio.com. Now here’s your host, Erin Sparks.
Erin Sparks: 00:00:56 All right, this is Edge of the Web radio episode 340. I’m your host, Erin Sparks, CEO of Site Strategics over here in Indianapolis. Every week, we bring you amazing digital marketing professionals and unpack digital marketing trends with those professionals each and every week, as well as the news topics of the day. So you want to make sure you check out edgeofthewebradio.com, where we’re hosting all that content. And if you’re new to the show, welcome. Thanks for jumping in and listening. We really appreciate the newcomers to the show.
Erin Sparks: 00:01:26 Here’s the ropes of the show, what we do each and every week. We go live on YouTube. Sometimes we try to get around 3:00 PM Eastern, but we certainly adjust our time for our foreign guests. Each week we actually start there, and then we actually take that to podcasting on your preferred platforms, so iTunes, Spotify, Google Play, Stitcher, iHeartMedia, Player FM, all the different podcast platforms. And we also transcribe this show. So if you wanted to see exactly what we were talking about in minute 15:38, you’re going to be able to go to that podcast transcript and jump right there. So all of that content, as well as a number of different blog posts each and every show, are over at edgeofthewebradio.com. If we’re not where you want us to be on the iTunes side of things, or I’m sorry, on the podcast side of things, let us know and we’ll certainly get our feed there.
Erin Sparks: 00:02:17 So Edge of the Web is actually brought to you by Site Strategics. They’re the title sponsor of the show. Site Strategics is an agile digital marketing pioneer inside of the spaces of SEO, technical SEO, social media, conversion optimization, and a number of other digital marketing tactics. So if you’re interested in what agile marketing is, it’s results-based digital marketing that’s really focused on the bottom line.
Erin Sparks: 00:02:40 So if you’re interested in what we can do for you, call us at (877) SEO-4-WEB or (877) 736-4932. I’m going to introduce my studio compadres. Jacob Mann is in the production booth along with Allie Coons. Allie, you’re not eating, right? Okay, so you can actually say something. Guys, how you doing?
Jacob Mann: 00:02:57 Hey, we’re doing well.
Erin Sparks: 00:02:58 Happy Monday to you.
Jacob Mann: 00:02:59 I know. Happy Monday. Good afternoon.
Erin Sparks: 00:03:02 Good afternoon, sir.
Jacob Mann: 00:03:02 Good evening, sometimes, for some people.
Erin Sparks: 00:03:04 Good afternoon, good evening, and good night.
Jacob Mann: 00:03:06 There you go.
Erin Sparks: 00:03:07 Just trying to reach our entire internet. Did you realize that about a third of our audience is out of the EU, and I think 23% of our audience is in Ireland?
Jacob Mann: 00:03:17 I did realize that, but I also look at our statistics a lot.
Erin Sparks: 00:03:20 That’s true.
Jacob Mann: 00:03:21 I would catch that kind of thing.
Erin Sparks: 00:03:22 So how are you doing? Thank you so much. And you’d better believe that we are always wanting to talk to people across the pond. In fact, we’re going to be doing that with one of our guests today.
Erin Sparks: 00:03:30 I do want to go over some housekeeping for you that are paying attention to our upcoming shows. Kim Scott’s going to be here on the 3rd of February. Robert Rose is here on the 10th of February. It’s always good to be able to have … Oh my gosh, literally all four of us are return guests here. We also have Susan Wenograd. She’s going to be here for her third time, as well as Sherry Bonelli on the 24th. So we’ve got a full stack of return guests coming to us on The Edge. And if you’re interested in hearing about somebody, if you want us to meet somebody and interview them, let us know. Just email firstname.lastname@example.org, and we’ll certainly do an outreach. And if you are an aspiring or a seasoned digital marketer, you want to be on the show, give us a shout as well.
Erin Sparks: 00:04:15 By the way, we’re going to be a media partner for Third Door Media over at SMX West in February. If you see us up there, grab us, and we may actually have an impromptu interview then and there. So we actually are media partners for SMX West, and we’re going to be cooking up a few pretty cool things here over the next few months.
Erin Sparks: 00:04:35 So, that said, that’s the housekeeping points here, but we do want to talk about SMX real quick, because it is a really cool conference. If you’ve not been out there, we certainly recommend you do that. In fact, Third Door Media, the producers of the show, have provided a great discount if you want to attend that conference. There are so many fantastic speakers. It’s a two-day event. Is it the 18th or 19th? It’s the 19th and 20th. I’m sorry. 19th and 20th? It’s the 19th and 20th, correct? Yeah, I think it is, and we are getting a nod. Yeah. They’re offering 15% off those tickets, and those full day access passes are incredible. There’s so many people to listen to. They’ve got three or four tracks out there of information. So, if you go and register today, all through the entire month, over at SMX West, search online and then you can use this code, Edge15, for 15% off.
Erin Sparks: 00:05:31 A bunch of speakers there that are friends of the show. In fact, Dawn Anderson, today, who’s on the show; Brad Geddes, Tim Jensen, who was just recently on; Elizabeth Marsten; Joe Martinez; Ginny Marvin; Lily Ray; Barry Schwartz; Aleyda Solis; Kirk Williams; Bruce Clay; and many, many more are speaking at SMX West, and we certainly want you to be there and be able to benefit from being an Edge listener as well. So Edge15 at checkout will give you 15% off.
Erin Sparks: 00:05:57 All right. That’s everything today. Oh, yep. Certainly want to talk about Ahrefs. Ahrefs is our continued sponsor of Edge of the Web Radio. We certainly appreciate them as an ongoing sponsor. I’ll tell you what, if you want to know what your competitors are doing in the SEO space, what they’re ranking for, what pages are bringing in the most traffic, what links they have, Ahrefs is a fantastic tool. They’ve been around for well over 10 years, and they keep on evolving their tool. So, you can see a lot of great information, and they’ve got a great audit tool as well. Just recently used that on one of our audits here for one of our clients. But they also have some great insight into link intersect of competitors. There’s so many things in there. So go over to ahrefs.com. A-H-R-E-F-S.com. Sign up for a free trial today and swim in great data, just like we do over here at Site. All right. With that, let’s meet this week’s featured guest.
Speaker 1: 00:06:53 Now it’s time for Edge of the Web featured interview with Dawn Anderson, Managing Director at BRTY.
Erin Sparks: 00:07:05 All right, so let’s introduce our guest here across the pond, right. We’ve got Dawn Anderson. She’s the managing director at BRTY. For those of you who don’t know Dawn, shame on you. You really should. She’s been a fantastic international SEO speaker. She is a digital strategist and consultant, speaker, trainer, and lecturer. She’s been doing SEM and marketing analytics at Manchester Metropolitan University as well. She lectures on digital marketing both for digital marketing strategy students at an undergraduate level and for international fashion promotion students at the undergraduate level as well. And that’s what we really need. We need authorities that are actually going back into the education space and really raising these kids up right, so to speak.
Erin Sparks: 00:07:56 Dawn, there you are right there. You’ve been speaking on all these circuits, Pubcon, SAScon, BrightonSEO, SMX London, UK, and San Jose, MozCon, State of Search. Anything else that I’m missing right now?
Dawn Anderson: 00:08:10 No, I can’t think of much that you’re missing there.
Erin Sparks: 00:08:12 I jammed it all into a pretty …
Dawn Anderson: 00:08:14 Yeah, you pretty much put that into like three sentences.
Erin Sparks: 00:08:18 I know. That’s the coffee. I told you it’s the coffee.
Dawn Anderson: 00:08:20 Well done.
Erin Sparks: 00:08:21 Oh, thank you. I appreciate it. Well, we love to reach out to guests that are leading the way in education and speaking, because there’s so much information that needs to be unpacked and demystified out there, because the environment is changing so much that you have to pump the brakes a little bit and then talk about key concepts. Yeah?
Dawn Anderson: 00:08:45 Yeah. You really do. Things are just evolving so quickly. And I think we were just saying before that it’s really hard to keep up, but, at the same time, you don’t want to end up being so focused on one area that you become almost pigeonholed. As a good SEO, I think you need to have a broad understanding of everything that’s going on in the whole space. Maybe then zoom in and really research a few areas. I don’t think you can be just a specialist in one particular area to the exclusion of everything else, because that’s not how SEO works. SEO is about understanding everything and how it all fits together, and even going beyond SEO and understanding how digital marketing as a whole works. SEO and [inaudible 00:09:38] specialisms work in a silo. That’s the point.
Erin Sparks: 00:09:43 Would you be so bold as to say that SEO really kind of gets an understanding of content, or I should say a customer intent, earlier than most of the other disciplines in digital marketing?
Dawn Anderson: 00:09:56 Well, I think SEO tends to have a huge reach element to it, if that’s what you’re talking about, and a huge amount of queries out there are informational far more than they are transactional. So we have an opportunity to be in huge spaces even before an audience is aware of us as a brand. So I think it presents a massive opportunity. So, yes, I think we have a huge advantage in that we could map things that are very, very early on in the intent stages to content. So, yeah, so the reach of SEO is significant, probably more so than most channels.
Erin Sparks: 00:10:39 And it is really the coolest job in digital marketing.
Dawn Anderson: 00:10:41 It really is, yes. It’s so interesting, particularly when … I think SEOs are becoming increasingly marketers as well. We have to be, because we can’t be kind of hidden in a basement anymore, just spamming.
Erin Sparks: 00:11:02 I got my best work done in the basement. I mean, come on.
Dawn Anderson: 00:11:06 I suppose maybe there still are some people. Sorry. The dog’s leaving.
Erin Sparks: 00:11:14 There’s the dog. All right. Oh no, there’s the Pomeranian. There’s the Pomeranian.
Dawn Anderson: 00:11:17 Sorry. She’s going out for a walk.
Erin Sparks: 00:11:19 So we want to make sure everybody knows Dawn loves Pomeranians, and we’ve got one on camera there.
Dawn Anderson: 00:11:25 Just on my knee, yeah.
Erin Sparks: 00:11:28 He wanted to jump out of there. But, Dawn, first, if you could kind of go back through your backstory and tell us and our listeners how you came to be at BRTY. Could you do that for us?
Dawn Anderson: 00:11:38 Yeah, well, I’ve been an SEO for nearly 13 years. Prior to that, I had a completely different career. I had a building company of all things, so completely different to what I do now, and so totally different. I had all these vans on the road and people that worked in my office and whatnot. As I say, I’ve employed people before. And that was really stressful.
Dawn Anderson: 00:12:10 At the time, I kept having websites built for the business, and I had some really wonky websites built by people. It was really quite early on when not every business had a website. And, as I say, I had no clue how they were built. So people just kept building me these awful websites.
Dawn Anderson: 00:12:31 And then I started to hear about SEO, and somebody came to see me to sell me SEO services as a business. They were talking about spiders and crawls, and I really thought they were a bit mad at the time. It was really back in the day.
Erin Sparks: 00:12:47 We get that a lot.
Dawn Anderson: 00:12:49 I mean, gosh, I even get that now when I start talking about crawling and spiders to any family. I kind of don’t do it anymore, because they just think you’re mad. So I don’t talk about my work to family.
Dawn Anderson: 00:13:02 But the point is so I became a little bit interested, and then I actually had a few bad experiences with people who were providing SEO services for me. It was just not good quality and doing dodgy things. And so I started to go and learn about it myself, and I found that I became more and more and more and more intrigued with it, more and more intrigued with web development, building sites, optimizing things, partly because, A, I didn’t really have a large budget to pay anybody else and, B, the experiences I talked about and, C, because I actually became really interested in it all.
Dawn Anderson: 00:13:45 So I got to the point where actually I was more interested in that than the industry I was in. And, over time, I transitioned out of it, went looking for agencies, started working on my own projects as well, and then, eventually, I started working for myself seven years ago, and [Moving Marketing 00:14:03] then became BRTY, which was named after my dog, because I was sick of spelling the domain name Moving Marketing. And also I felt Moving Marketing was … the name tied us into doing certain things, and, over time, you never know which direction you’re going to go in. It could be that we’ll eventually get involved in web development, app development, marketing, all kinds of other things. So I think it was just a good idea to have something simple. So that’s that story. Yeah.
Erin Sparks: 00:14:41 Let me ask you this. As you were starting to develop your SEO, not only for your own company, but as you started to actually eyeball that this could be a career change, did you start going vertical in your particular industry for optimization? Were you selling their optimization services to other building communities?
Dawn Anderson: 00:15:01 Well, no. I’ve not really special … I’d say I’ve not really specialized, but what’s happened is I went working for agencies, and what I found is, thankfully, I was taught SEO by an information architect. So that was amazing. And I’m still in touch with him now, 13 years later. I was taught it in a building-block style, and I really went more down that technical route than the whole links thing to start with and more about how websites were built and how they were crawled and how … So I was very much more interested in that to start with.
Dawn Anderson: 00:15:47 Then when working with agencies, I found that actually understanding those concepts means that you can pretty much do SEO for any vertical, because the building blocks of the web, the building blocks of crawling, the building blocks of site structure are the same regardless of the industry.
Dawn Anderson: 00:16:09 But, to be fair, what I would say now is that, over time, I have ended up with a proportion of different clients from certain verticals more than others. So I have ended up with quite a few clients that are in the home and garden sector, travel, and fashion’s quite a big one for us now. So I think it becomes one of those things people recommend you to people in their sector. And it’s kind of been quite a natural evolution, but we do work with clients across lots and lots of different verticals really.
Dawn Anderson: 00:16:47 I think if you understand fundamentals of sites crawling the web, you can put your hand to any vertical, apart from on the PR side of things, because I think having an actual knowledge of the conversation, the relationships, the influence, et cetera, I think it does matter if you are a specialist, because you build up a lot of contacts. But that’s PR generally. So those are my thoughts on that.
Erin Sparks: 00:17:21 That is the influential property inside of PR: those relationships and those connections.
Dawn Anderson: 00:17:28 Who do you know, who are the influencers, who speaks to you, who will do content with you, who will co-create, that kind of thing. Yeah.
Erin Sparks: 00:17:39 So as you have been focused on the database structure or the information technology structure, as opposed to some SEOs that get and find their way maybe from other digital marketing disciplines, you have an appreciation of the foundation and the technical nature and the relationship of content, both external and internal. Whenever you look back at some of those SEOs, those dodgy SEOs, as you put it, does it make you just grind your teeth a little bit about what they were getting away with back then that they should have known better?
Dawn Anderson: 00:18:13 Yeah, but I think it’s not okay to blame. I think it’s more a case of the times have changed.
Erin Sparks: 00:18:20 Yeah, you’re absolutely right.
Dawn Anderson: 00:18:24 At the time, people could get away with things much more easily. It wasn’t necessarily that it was bad. It was just it wasn’t effective for us. As a business, we found that actually what was happening was that really the services that we were receiving was just spitting out things and just repeating strategies over and over again and just building link networks almost. And I know some of that still goes on, it’s not for me to judge anybody in [inaudible 00:19:02] at all. I’m not going to do.
Dawn Anderson: 00:19:05 But I will say that, as a customer, at the time, well, it annoyed me, because I thought, “Well, this just doesn’t feel right.” And I remember the guy who taught me SEO, he said it to me … I said, “Oh, I’ve started getting some link building done.” And I was very naive at the time. And he said, “Do me a favor, Dawn.” And he knew a lot more than me then. And even now he’s an enterprise information architect, and so occasionally, as I said, I keep in touch. We have a chat from time to time. But he said to me, “Dawn,” but he used to work as head of SEO for a massive agency for years back as well before he was working with me.
Dawn Anderson: 00:19:52 The point is, he said to me, “Dawn, do me a favor. Show me the links they’ve built for you.” So I sent him the links, and they were all just clearly all people they knew, loads and loads of link networks all just like, “Oh, we’re joining together, creating link wheels,” and so forth. And some of them were just totally irrelevant to our sector, and he said to me, “Would you go to the baker’s to buy a toothbrush?” And that was appalling. They were literally just flogging me anything because I was naive. They were the equivalent of you go to the garage to get your car fixed. Because you don’t know any different, you just go, “Oh yeah, fine.” So that’s how I just ended up doing my own SEO more and more really.
Erin Sparks: 00:20:44 It is important for any owner of a business to have that level of insight and a bit of knowledge of what they’re actually buying, because there is a buyer-beware environment still in SEO. It upsets us greatly. I can appreciate your diplomatic point there, but, at the same time, there are some best practices and there are some terrible practices that have literally … that were 10 years ago that are still cropping up. In fact, we, over at Site Strategics, just found ourselves disavowing links that a different company was hired to do literally a couple of months back as you inherit the client, and we’re looking at it going, “Okay, there’s still this practice going on.” And you just want to scream sometimes, because it’s so terribly damaging to the value, not only for the immediate ranking, but Google’s keeping a record, keeping a rap sheet so to speak, of domains, and they have a much longer history, a much, much longer visibility of where these domains have been over time that you pick up these links and it’s not just like scraping barnacles off a hull. There’s a reputation that’s being built.
Dawn Anderson: 00:22:03 I think also, as well, two things. Firstly, it’s very difficult for people to change. If something’s worked in the past, it’s very hard to change your habits because, “Hey, this works, so why don’t we just carry on with what we were doing?” I think there’s that element. People have processes, structures that are built for something that maybe no longer works or works to a limited extent, but they still have a business they feel like. So it’s hard to change.
Dawn Anderson: 00:22:36 But I think also we’re in an age of machine learning now, and search engines certainly I would say have strategies where they are learning. They’re learning the link networks utilizing machine learning, probability, predictability, patterns, modeling, and so forth. So I think trying to utilize some of those strategies in an age of machine learning is probably flogging a dead horse in the long term.
Erin Sparks: 00:23:14 Absolutely. And that really gets us to kind of center mass of some of this conversation here today. And it has to do with the new technologies that are in place, the machine learning, the AI, so to speak, that we’re experiencing. We’ve had a number of different SEOs talk about machine learning and the like, and we’ve really started to circle around regularly voice search and natural language processing.
Erin Sparks: 00:23:39 So for those of you at home that haven’t come across this before, natural language processing, or NLP, is basically oriented towards … the search engines are actually now in line with everyday conversation. They’re starting to understand the patterning and the connectors that make up intent or make up understanding. BERT was, not your Bert, Dawn, but BERT just recently, very serendipitous, though. BERT, the algorithm change, got released in the latter part of 2019. It truly was connected to the new patents that Google has regarding understanding these transformers, these agents that actually break apart and understand what your true intent is.
Erin Sparks: 00:24:32 So you’ve been diving in, just like we talked about before, you’ve been focused foundationally in the technical space. This is that next new foundation that all SEOs should start to appreciate, correct?
Dawn Anderson: 00:24:47 Yeah, I think increasingly search engines want to understand how people really speak, rather than the whole SEO text, and particularly [inaudible 00:25:02] move towards this new age of conversational search, because people don’t talk in the same way as we read content in SEO blog posts or that kind of thing. People speak in a natural way. Oddly enough, on that point, actually, when people are speaking to search engines, they actually speak in a slightly unnatural way. If you’ve ever asked Google Assistant or Alexa for something, we have almost this, they call it [keyword-s 00:25:36] way of speaking. So humans actually say things like, “Okay, Google.” I better be careful, actually, because –
Erin Sparks: 00:25:43 You’re activating all the devices in the room.
Dawn Anderson: 00:25:47 I’ve got four Google Homes and Alexa in my house and just realized there’s one on my TV the other day, so I have to whisper it.
Dawn Anderson: 00:25:54 The point is when we ask for information, we’ll speak in a very like, “Please tell us the way to blah blah.” It’s not actually very natural that way. But the point is there’s a huge amount of content on the web. Most of it is unstructured. We have a few different types of content. We have the very structured relational databases and things like data sets. A lot of the websites are … the big websites have a lot of structured content, because they’re driven largely by database systems, like the booking.coms and eCommerce sites. And it kind of all comes out in fields.
Dawn Anderson: 00:26:41 And then you have the semi-structured data, the likes of the knowledge repositories like Wikipedia, and a lot of the informational content, which we’re all producing now that, [inaudible 00:26:52] featured snippets and position zero generally and answer all these questions.
Dawn Anderson: 00:26:57 And then you have this whole hot mess, which is things like the stuff in between all of those structured parts, within the paragraphs, within the blurb, and within things like user-generated comments, which are probably the most natural kind of language that you can find on the web, because it has had no influence from copywriters. It’s had no influence from –
Erin Sparks: 00:27:24 Marketers.
Dawn Anderson: 00:27:24 – SEOs, yeah, marketers. It’s just people putting comments on and saying, “Hey, this is what I think,” or, “This is why I’m annoyed,” or, “Help me with this,” the likes of Quora or the likes of Stack Overflow and those kinds of really, really rich and natural conversation.
Dawn Anderson: 00:27:43 So there’s loads of that. User-generated content is massive. It’s part of the reason, well, it’s probably the reason for the explosion of the web in terms of content. But to think, if I’m not mistaken, Eric Schmidt said it a few years ago, because every single tweet is a new webpage, for instance. But that’s user-generated content mostly, apart from with the influence of marketers. But, again, even marketers don’t really structure that for SEO. They structure it for impact.
Erin Sparks: 00:28:14 So they’re positioned to try to convert, SEOs trying to rank. And the engines are trying to make sense of this and kind of deprioritize those particular manipulations and look towards natural language as the glue between all of this. It’s the unstructured, unprioritized natural way we speak. It just dawned on me as you were speaking, one of the key, almost a key bit of research that I conduct regularly … My house, I’ve got four kids, and we’ve got Alexas all over the bloody place and how they interact with Alexas and how they ask questions. They’re not wired for the keyword-search-query language that we all basically get immersed in utilizing the desktop, utilizing the keyboard. They’re now digital natives beyond anything that we’ve ever seen, and they’re naturally wired at understanding how to get the best results based on some of their intent. And that in its own right could very well be an incubation inside of smart speakers, just listening to the kids.
Dawn Anderson: 00:29:34 Think about it, if you were building a company now or if you were building something for demand, you would probably be building and anticipating the next big wave of demand and that’s the next generation of people that are coming through, so the digital natives, those that have no issue with talking to devices and searching in a different way and multi-devicing and with the short attention spans and even interested in pictures and videos probably as much as they are in reading long pieces of text. So search engines build things to scale and future-looking. So absolutely. Things are not being built for us. They’re being built for the next generation.
Erin Sparks: 00:30:24 You better believe it. And we want to be able to capitalize on that and take a little cut of that for ourselves. But you’re absolutely right. They’re soon to meet those digital natives on their playing field. And, on top of that, I keep on going to the kids, but the digital natives are also accustomed to that single answer, not the multi-answer environment that we have all been striving to be able to optimize for. So there’s a 360 feedback loop that we are experiencing that we’re getting more and more satisfied and more and more critical of that singular answer. So this voice search is a very, very important feature to pay attention to on the input side, not just on the output side, correct?
Dawn Anderson: 00:31:15 Yeah, it’s not very good, though.
Erin Sparks: 00:31:17 No, it’s still not good.
Dawn Anderson: 00:31:17 That’s the problem. No, that’s the problem. There’s a lot of inconsistencies. And, at the minute, I’m actually preparing a deck this week that looks at … We’ve had this era where we’re moving now towards the web of data kind of almost coming to fruition. It’s been a long time coming since … Tim Berners-Lee years ago coined the phrase linked data and talked about the web of data versus the web of documents. A lot of us in SEO are still optimizing. We should optimize for the web of documents, because the web of documents has a huge impact on natural language disambiguation and meeting informational needs, but it’s messy. The web of data is semantics, schemas, OWL, which is the web ontology language, RDF triples, and those kinds of things, and a lot of voice search results and single answers come from data rather than words.
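[Editor’s sketch: To make the “web of data” idea a little more concrete, here is a minimal illustration of facts stored as subject-predicate-object triples, the shape RDF uses. It is a toy in plain Python, not RDF itself and not how any search engine stores data. The predicate names `administrativeArea` and `historicCounty` and the helper functions are invented for illustration, and the Twickenham facts are borrowed from later in this conversation.]

```python
from typing import List, Optional, Tuple

# Each fact is a (subject, predicate, object) tuple, mirroring the shape of
# an RDF triple. The predicate names are hypothetical, not a real vocabulary.
Triple = Tuple[str, str, str]

TRIPLES: List[Triple] = [
    ("Twickenham", "administrativeArea", "Greater London"),
    ("Twickenham", "historicCounty", "Middlesex"),
    ("Hounslow", "administrativeArea", "Greater London"),
]


def match(subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in TRIPLES
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def single_answer(subject: str, predicate: str) -> Optional[str]:
    """Return one object for a subject/predicate pair, or None if unknown."""
    hits = match(subject, predicate)
    return hits[0][2] if hits else None


if __name__ == "__main__":
    print(single_answer("Twickenham", "administrativeArea"))
    print(match(obj="Greater London"))
```

[A single factoid answer falls out of a lookup on data like this with no sentence to parse, which is the contrast being drawn with the messier web of documents.]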
Dawn Anderson: 00:32:25 But the problem is there’s this notion of semantic heterogeneity. That means that there are databases pulled together by the knowledge graph that actually drags in information that is conflicting from different sources to meet an informational need. And I asked on Twitter, and if you get the chance, have a look, about the inconsistencies that are shown between pictures and featured snippets or answers that are given that are wrong or … knowledge repositories, which drive a lot of the featured snippets on voice search and actually have a contribution to voice search rather, and provide that single factoid answer, that happen to conflict with each other.
Dawn Anderson: 00:33:18 And somebody only today shared with me that DBpedia and Wikipedia, which are knowledge repositories, well, DBpedia is like a knowledge graph. It’s more structured. Then Wikipedia, which is a knowledge repository, is semi-structured data. If you look at 50 philosophers side by side, one gets the dates of birth of all 50 completely different to the other. Wikipedia conflicts with DBpedia. So, actually, you have search engines pulling from DBpedia, Wikipedia, now, actually, they’re pulling from the web at large as well to fill in gaps. BERT is also in there with featured snippets and answers as well.
Erin Sparks: 00:34:08 All trying to figure out what’s best for the consumer.
Dawn Anderson: 00:34:11 There’s lots of conflicting data, which is giving a lot of the wrong answers. This is the problem. So it’s not that great yet at all.
Erin Sparks: 00:34:21 But the old adage of garbage in, garbage out, we, as users, have actually put that trash out there, so to speak.
Dawn Anderson: 00:34:30 Well, actually it’s often the websites that are putting things out. And, in actual fact, Wikipedia played a joke last year on Google. Wasn’t meant to be a joke, but it was making a point. Do you remember the whole thing about how many legs does a horse have?
Erin Sparks: 00:34:48 Yes.
Dawn Anderson: 00:34:48 And [crosstalk 00:34:48] sort of work out whether it had six legs, which was two forelegs and two hind legs, where, when you look at fore, F-O-R-E could be considered a misspelling of F-O-U-R, and two hind legs, so four and two is six. Wikipedia has a whole page that says, “How many legs does a horse have?” and it basically makes the point, “Look, just because I saw something has six legs, it doesn’t mean it doesn’t have four legs. In other words, don’t believe everything you see out there.”
Dawn Anderson: 00:35:23 You look at featured snippets that talk about counties of the U.K. It’s inaccurate, because it pulls information from Wikipedia. So you’ll say, “Which county is Twickenham in?” And it’ll say, “Middlesex,” but in actual fact Middlesex was abolished years ago. Twickenham is in Greater London. Google can’t differentiate between historic counties. The word historic changes everything. So it’s quite a mess really. It’s getting there bit by bit. But the average searcher doesn’t think about these things like we do.
Erin Sparks: 00:36:04 No, but are you also considering that the consumers, digital savvy consumers, they’re actually playing a part in this, and they’re accepting, they’re realizing that there’s an error-handling issue that’s going on, and they’re trusting the engines to have this entity-relationship model and the counter model. And we talked about this with Bill Slawski a while back, and he’s just deep into Google patents, bar none, and understanding that, along with BERT, there’s this new entity-relationship and vetting process that’s going on inside of the engines that we’re all privy to, and we’ve set the machine to start digesting all this. And we’re almost … what do you want to say? We’re almost accepting of these naturally occurring errors, because we know it’s going to be purged out of the system eventually. Would you say that or is that too far of a stretch?
Dawn Anderson: 00:37:04 Yeah, I would say that the average searcher believes the search engine. What happens as well is there’s an element of reinforcement learning in that as well. Everything is about probability. Again, it’s going into this deck I’m creating at the moment, actually, so when you put in “Which county is Hounslow in?” Hounslow is actually in Greater London, but, again, Google thinks it’s Middlesex, because it pulls it from Wikipedia, even though Wikipedia actually says it’s in the administrative area of Greater London. It doesn’t understand historic county and that actually Middlesex was abolished years ago. So it doesn’t understand that whole notion of it’s gone.
Dawn Anderson: 00:37:50 What happens is you have people who, for years of their whole lives, used to always type in “Hounslow Middlesex,” and, actually, when you start to type in “Hounslow,” people also ask … the first thing that comes up in the people also ask “Is Hounslow Middlesex?” But then when you also look in people also ask and the questions, you see lots of people asking “Is Middlesex in Twickenham?” Sorry, “Is Hounslow in Middlesex?” Or “Is Middlesex still a county? Is Hounslow in Greater London?” You can see that there’s confusion, but Google still spits out the wrong answer, because of the way that there is this whole notion of semantic heterogeneity, conflicting databases.
Dawn Anderson: 00:38:36 And there’s this notion of equiprobability, where, with all things being equal, whatever we have, everything’s equal, so there’s nothing that’s gaining favor one or the other. But then when people start keeping it alive, because they’re older and they grew up in a time when Hounslow was in Middlesex for all of their lives, you have them reinforcing it: Hounslow Middlesex, Hounslow Middlesex. So it’s that whole popularity of Hounslow Middlesex.
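The equiprobability-then-reinforcement idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `label_probabilities`, the equal starting weight, and the made-up counts of eight "Middlesex" observations against two "Greater London" observations are all hypothetical, and nothing here is meant to represent how a search engine actually scores answers.

```python
from collections import Counter


def label_probabilities(candidates, observations, prior_weight=1.0):
    """Give every candidate the same starting weight (equiprobable),
    then add one count for each time a candidate is observed."""
    counts = Counter({c: prior_weight for c in candidates})
    for obs in observations:
        if obs in counts:
            counts[obs] += 1
    total = sum(counts.values())
    return {c: counts[c] / total for c in candidates}


candidates = ["Middlesex", "Greater London"]

# With no observations, nothing gains favor: both answers sit at 0.5.
before = label_probabilities(candidates, [])

# Hypothetical usage data: searchers keep pairing Hounslow with Middlesex.
after = label_probabilities(
    candidates, ["Middlesex"] * 8 + ["Greater London"] * 2
)

print(before)
print(after)
```

With these invented counts the older label ends up well ahead, even though nothing in the tally says it is the correct one. Popularity alone moved the estimate.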
Dawn Anderson: 00:39:08 And also you have websites, websites that build location-based structures, like directories, like travel sites, like booking sites, anything with a geolocation aspect. Some of them build their own location structures. Some of them are looking at Wikipedia. Some of them are taking data from postcodes. Some of them are taking it from grid reference systems. Some of them are taking it from OpenStreetMap. All of these sources just cause more and more conflict.
Erin Sparks: 00:39:40 More and more dissonance. And, in fact, I think we’re, just by this conversation, we’re affecting the Hounslow-Middlesex relationship with our transcript. Just kidding.
Erin Sparks: 00:39:51 The point is that there’s an ongoing conversation, and it’s beholden on the users to be able to push accuracy into this space. And, with machine learning, we’re now in this environment … at least finally, we’re in the space where the environment is now rich to be able to listen to accurate contributions from consumers. But you still have this battle of entities. It’s very interesting to watch as a concept starts fighting with itself. It’s almost you’re seeing it live in action, just like your references that you’re seeing the discrepancy of the dissonance, and then ultimately start to fine tune and hone the key concepts.
Dawn Anderson: 00:40:36 Exactly that. Yeah, no, exactly. I think what happens as well is, as entity graphs … as knowledge graphs get more and more populated and there’s more data available, the probability seesaws. Everything for me is like a seesaw. The probability seesaw will begin to take more and more in one direction over the other. And, again, as we mentioned, machine learning, just predicting, predicting, predicting, “Well, is this that entity? Is this right?” when in connection with that entity and so forth.
Dawn Anderson: 00:41:12 And Bill Slawski you mentioned earlier, who I’m a great fan of and we spoke together in Milan and we went for coffee and we had some fantastic chats, he shared, only in February of last year, [inaudible 00:41:26] that was looking at how Google is actually now extracting not just from known knowledge repositories, like well respected and established [inaudible 00:41:36] repositories, but just extracting data on entities from the web generally to add to the knowledge graph. And they have all these attributes.
Dawn Anderson: 00:41:45 Attributes are the thing you know. The more attributes there are, the more they can be certain, and the entity determination, it gets easier if you’re like … And, again, now taking it back to the whole BERT thing, entity determination is a big use for BERT as well, understanding more and more in context with what this relates to, what that conversation relates to, and even the context of a sentence about river. We mention the word deposit … Or, sorry. In the context of a word bank, we mention the word deposit [inaudible 00:42:25] the word river, then we know it’s a financial bank. If we mention water, then it’s probably going to be about a riverbank.
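The bank example above can be illustrated with a toy context-overlap check. To be clear about the assumptions: BERT itself uses learned contextual representations rather than hand-written word lists, so this is a much simpler stand-in that only shows the intuition. The `SENSE_CUES` table and the `disambiguate` function are hypothetical names invented for this sketch.

```python
# Hypothetical cue words for each sense of "bank" (hand-picked for illustration).
SENSE_CUES = {
    "financial bank": {"deposit", "loan", "account", "money"},
    "riverbank": {"river", "water", "shore", "fishing"},
}


def disambiguate(sentence, sense_cues=SENSE_CUES):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. the context is not enough.
    """
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    scores = {sense: len(words & cues) for sense, cues in sense_cues.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("I made a deposit at the bank"))
print(disambiguate("We sat on the bank by the water."))
print(disambiguate("The bank was closed"))
```

The third sentence has no cue words, so the sketch declines to guess. More surrounding attributes make the determination easier, which is the point being made in the conversation.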
Dawn Anderson: 00:42:34 So you know everything is just about reinforcement and alignment of all of these areas, all of these different types of data. The natural language data, which is a hard challenge, the semi-structured data, which is a bit easier but it’s very easy to over-optimize a website with so many structured [inaudible 00:42:53], because you start to tip the balance too far. It starts to look a bit spammy. And then there’s the very structured data, which is tables, unordered lists, relational databases. You see a lot of those obviously in featured snippets, because they’re easy to extract to address size charts or that kind of thing.
Erin Sparks: 00:43:17 You laid it all at our feet here on the show of all the different pieces of content that Google, just engines in general, are digesting to be able to understand relationships, understand the ongoing growing list of attributes, ’cause this doesn’t just stop with known factors, it’s learning as it’s going to be able to build additional, discernible attributes that will be able to push away from key concepts and kind of not silo but be able to grow the strength of that and the strength of clarity and truth on that [inaudible 00:43:53].
Erin Sparks: 00:43:53 It gets back to how do we optimize our content towards this ever-learning engine. And I guess I have to post a flag right here regarding relevancy and creating content that is relevant to itself and each other and start creating your own contribution into this learning ecosystem that we have. Is that the key element, the key factor of how we can play in this space?
Dawn Anderson: 00:44:23 Yeah, I think it depends on the site. So if, for instance, I was thinking about. The other day, the [inaudible 00:44:31] I would say that everything a search engine does is built for scale and impact, search impact, even the notions of crawl budget and so forth. Everything is built for what strategy is it going to help the most impact on the most people on the biggest population, and it goes back to Pareto’s Law, the 80/20 rule in everything. 80% of the traffic demand will be probably satisfied by 20% of the sites. 80% of the people will probably visit only 20% of the URLs and so forth. So I think it depends.
Dawn Anderson: 00:45:11 And I would say that some of the biggest websites in the world are actually relatively thin content sites in the main, so they’re booking engines. They have search engines themselves. You think about a search engine, the page is actually pretty small, pretty thin, on the search engine. If we think, “Well, a search engine is just a website, a big website,” and a lot of them are really thin pages. Well, they probably, apart from the likes of Wikipedia obviously, which is not a thin-paged website. It’s a very huge-paged website with kind of long-form content.
Dawn Anderson: 00:45:53 So I think it depends. But I think … I also ran a little bit of an experiment this morning, and we had a conversation on Twitter that said, “Hey, who can share a screengrab of the most excluded URLs?” And [inaudible 00:46:05] Google Search comes up.
Erin Sparks: 00:46:07 Yeah, we’ve got a winner. It’s over a billion. Unbelievable.
Dawn Anderson: 00:46:11 The thing is, I don’t know what kind of sites these are, but I would hazard a guess that a lot of them are massive, database-driven engines, booking engines, the likes of travel sites, the likes of huge eCommerce sites, the likes of sites that actually are trying to build huge amounts of transactional pages without the value that they need to add to a positive network effect.
Dawn Anderson: 00:46:40 Have you heard of the network effect where actually –
Erin Sparks: 00:46:44 No. Unpack that for us.
Dawn Anderson: 00:46:47 Actually, it’s a part of statistics. Basically, it’s exponential growth. So you see some sites get bigger, bigger, bigger, bigger, bigger. And these past few years, a lot of quality of [inaudible 00:46:58] seems to come around. But there’s a lot of them have been impacted sites that maybe had a lot of what you would consider thin and [inaudible 00:47:05] content, going back to the old [inaudible 00:47:08]. But to point A, my thoughts are that, actually, part of the reason is they don’t actually have … a lot of those pages, probably, A, a lot of them are trying to act bigger than their boots. They’re trying to have a bigger footprint online and reach more people than the positive network effect that they contribute. But they don’t actually add the value overall that they’re trying to … they’re trying to be bigger than their boots. They’re walking around in their moms’ shoes.
Dawn Anderson: 00:47:55 Yeah, but also I think a lot of the time a lot of the issue is it’s not that content is thin. It’s the fact that the content doesn’t add value. So, actually, you look on an eCommerce site. I see a lot of people going down the eCommerce route of adding loads of text on fashion pages. There’s no way I’m ever going to say to my client in the luxury fashion space, “Hey, add a lot of SEO text to the transactional pages.” Fashion is driven by images, videos, images, and so forth. You don’t need a lot of text on those transactional pages. What you do need is things that add value.
Dawn Anderson: 00:48:35 So you think about your audience, think about entities, think about what adds value with a product. So it’s things like schema. It’s things like everything that’s on the schema list. It’s price. It’s color. It’s video. It’s reviews. It’s user-generated content. It’s questions. It’s sizing. All of those things add a feature. They add features and they add attributes, and they add value if done well. That’s the concept that a lot of these sites don’t add.
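The product attributes listed above (price, color, reviews) map onto schema.org's Product vocabulary. As a rough sketch only: the helper name `product_jsonld` and every value passed to it (the scarf, the price, the rating, the review count) are invented for illustration. The property names `color`, `offers`, `price`, `priceCurrency`, `aggregateRating`, `ratingValue`, and `reviewCount` are taken from schema.org, but which of them a given search engine actually uses is not something this sketch claims.

```python
import json


def product_jsonld(name, price, currency, color, rating, review_count):
    """Build a schema.org Product description as a plain dict (JSON-LD shape)."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "color": color,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating,
            "reviewCount": review_count,
        },
    }


# Hypothetical product data, for illustration only.
markup = product_jsonld("Example silk scarf", 120, "GBP", "navy", 4.6, 37)
print(json.dumps(markup, indent=2))
```

The printed JSON would typically sit in a script element of type application/ld+json on the product page, alongside the images and video rather than in place of them.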
Dawn Anderson: 00:49:04 So it all depends. If you’re trying to gain traction in the long-form content, like a Wikipedia or informational site, then, sure, yeah, you want to add and answer as many questions as you possibly can. But a lot of that is not transactional.
Dawn Anderson: 00:49:23 Well, what you will find, and Ewan, who’s an SEO that we know, an international SEO, on Twitter shared the other week that when an eCommerce client chopped away a lot of informational pages, what was the impact? The transactional pages tanked, because the probability of you being able to help with the transactional is reinforced by actually adding lots and lots of rich informational content. So the two have an impact on each other.
Dawn Anderson: 00:49:59 And I think also there’s a lot of big sites out there that have pruned away old categories, subcategories, probably with loads of user-generated content on them in the past, not realizing actually that that user-generated content, as long as it’s not spam and hacked comments and other kinds of carry-on, adds value. That’s actually probably one of the most valuable parts of one of these massive sites that you could imagine, because it’s the natural voice. It’s the real people. It’s part of the network effect, where people actually [inaudible 00:50:32] and wanted to be involved. So somebody hacks them all, gets rid of them all, you’re going to tank, because you actually got rid of all those rich, natural things that actually contributed to your domain ontology overall.
Erin Sparks: 00:50:48 Perfect example would be reviews stacked around a particular product page. Well, that increases the dwell time on that particular page, because the user is actually engaged and they’re reading through other people’s experiences. As soon as you jettison that or God forbid that you actually even fabricate some of that, all of a sudden your engagement will drop. If you jettison that content, all of a sudden it just disappears, and the value is no longer there. So it’s not just content. It’s content based on the intent that –
Dawn Anderson: 00:51:19 And the humans, and the humans. The point is, notwithstanding fake reviews, because obviously we know that this is still a big problem out there with fake reviews and bias and huge mess. The reviews are created by a very tiny percentage of the population, which is actually much less than the norm, and the reviews are actually very valuable. Anything created by real, genuine users is valuable, probably as valuable as the content created, carefully crafted by the SEO on the page.
Dawn Anderson: 00:51:54 That is a big shame. I see a lot. I’ve done it myself. I did it by accident on one of my own kind of toy projects if you like, where I thought, “Oh, we’re going to cut back on the categories here,” not realizing that we actually had like 70,000 user-generated content pieces that I just accidentally unattached from tags. So they suddenly became invisible in the site. And it had a disastrous effect, but it was a personal project. So that’s kind of being fixed over time and remapping things. And we’re actually going to be using something like BERT on that to recategorize the natural language and subcategorize those concept pieces back into something that can be logical and useful.
Dawn Anderson: 00:52:49 And also I see a lot of people merging things that are topically relevant to try and say, “Oh, wow, we’ll just merge this content.” Well, actually, no, you can end up actually ambiguating the topic more by merging things that are not topically very, very close.
Erin Sparks: 00:53:10 So you got to be careful.
Dawn Anderson: 00:53:11 A very, very fine line with natural-language understanding for sure.
Erin Sparks: 00:53:16 Understood. We’re talking about very large sites, very large repositories of information and user-generated value. Whenever you’re talking to small-, medium-size businesses that realize that NLP is in the ecosystem and they should try to contribute or they should try to optimize towards it. What kind of guidance can you give them, because they certainly don’t have the deep structure of information that you’re referencing here? What are some key thoughts there?
Dawn Anderson: 00:53:46 Okay, so, first and foremost, there’s a reason why it’s called natural language processing, because it’s supposed to be natural. So that’s the thing. So the whole point is: don’t optimize for natural-language processing, because it begins to look fake and phony, and actually it’s pretty easy for search engines to pick up on. So if you start having pieces of content that say, “If you want to buy blue shoes, here’s a blue shoe that’s for your left foot and a blue shoe that’s for your right foot,” and we must make sure that there’s shoe in there so many times and shoes in there so many times and footwear in there so many times. It kind of begins to look unnatural, particularly if you do it to a template, because the pattern is … Search engines are amazing at picking up patterns, and obviously this is machines that are picking up patterns and clustering patterns and so forth.
Dawn Anderson: 00:54:42 The point I would make is there’s a difference between over-optimizing for natural language and therefore tipping things into not natural and actually providing clear and semantic structure to a webpage. So structure is important, particularly on things like a small … I know a lot of smaller sites that have eCommerce structures, categories and the subcategories and so forth. I would be very careful about using things like subcategories on blogs, which typically are very … they’re really just very unstructured and loose anyway. And if you’re doing things on a WordPress site, I’d probably use the pages part of the site to build evergreen content and build it out with a subcategory and category child-parent structure. So that’s what I would do. I would utilize tagging well. Personally, I don’t index tag pages. I utilize tags to pull in related content to gather and build a connection in the supplementary content –
Erin Sparks: 00:55:46 For the user.
Dawn Anderson: 00:55:47 – for the user and for the search engine, because it builds out a relevance factor. So that’s the kind of strategies that I would recommend to a smallish site, really, and a bigger site, just everybody, really.
Erin Sparks: 00:56:02 And on top of that, opening up the annals of comments and engaging with users, allowing that natural growth of the natural patterning of comments, right?
Dawn Anderson: 00:56:14 Yeah. And just the network effect: these big sites, a lot of them grow too massive while they just build these structures much bigger than their value. They’re not adding any value in the informational content. They’re just trying to build this structure that will [inaudible 00:56:33] automatically. But without the users … so actually engage with the users, encourage the users, encourage returning visitors.
Dawn Anderson: 00:56:42 It’s like I see a lot of sites say, “Hey, these are popular products.” But actually they’re trying to tip popularity in favor of their business goals. “This is a popular location in the middle of the outer Hebrides.” How can it be popular? Nobody lives there. It’s because you’re trying to manipulate your internal rankings.
Dawn Anderson: 00:57:06 So popularity and things like Zipfian distribution … Zipfian distribution is easy to measure. All the population of all the cities of all the world go in what they call a Zipfian distribution, so a massive head and a very long tail. So actually popular is probably the top 50 cities of the country, not what you’re trying to manipulate and trying to rank for. So I see those kinds of things.
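A Zipfian distribution of the kind mentioned above is easy to generate and measure. The following is a minimal sketch, assuming an idealized Zipf law in which the city at rank r has size proportional to 1/r. The function names `zipf_sizes` and `head_share`, the figure of 1,000 cities, and the largest-city size of one million are all invented for illustration; real population data only approximates this shape.

```python
def zipf_sizes(n_items, largest=1_000_000, exponent=1.0):
    """Idealized Zipf sizes: the item at rank r gets largest / r**exponent."""
    return [largest / rank ** exponent for rank in range(1, n_items + 1)]


def head_share(sizes, head):
    """Fraction of the total held by the `head` largest items."""
    ordered = sorted(sizes, reverse=True)
    return sum(ordered[:head]) / sum(ordered)


# Hypothetical country with 1,000 cities following an ideal Zipf curve.
sizes = zipf_sizes(1000)

share_top_50 = head_share(sizes, 50)
print(f"Top 50 of 1000 cities hold {share_top_50:.0%} of the population")
```

Under these assumptions the top 50 cities, which are 5% of the list, hold well over half of the total population. That is the massive head and very long tail being described, and it is why a remote location labeled "popular" stands out as a pattern that does not match the natural distribution.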
Dawn Anderson: 00:57:37 So try and utilize strategies that go with the natural popularity of things and the naturalness of the world.
Erin Sparks: 00:57:48 Understood.
Dawn Anderson: 00:57:49 Does that make sense?
Erin Sparks: 00:57:49 It hearkens back to everything that we’ve ever learned with SEO is that if you try to manipulate it, it’s going to fail. Maybe not immediately, but the engines will catch up to the patterning that you’re doing, the type of overt manipulation. So write organically. Write naturally for the user. It sounds cliche, but it absolutely is the best scenario. Certainly make the interrelationships and leverage the technology to be able to create the linkage, but don’t write … And I was kind of teasing at the beginning of the show, ’cause I was literally going through every question that people say in the Google knowledge panel regarding natural search processing. The fact of the matter, don’t copy those key phrases. Don’t copy those sentences, because that’s actually a feedback loop that you’re not natural. You literally need to have a break there and stop trying to game the system. Just open it up for consumers and be able to engage at that level, yes?
Dawn Anderson: 00:58:51 Yeah. But also, I would still utilize the question as a heading. Questions people also ask, yes, I would make that a heading. But I wouldn’t have one for everything, because the engines paraphrase. It’s about being sensible, not pushing anything too far. So yeah. And that’s it. What would you not be ashamed of, just think about …
Erin Sparks: 00:59:24 Yeah, exactly.
Dawn Anderson: 00:59:25 Try and step out of our SEO heads, and sometimes it’s quite hard. We type in CamelCase. We have SEO brains.
Erin Sparks: 00:59:36 We’re going to break that sometimes. We have to be able to [schleff 00:59:39] that off, because we … The engines want to learn authentic, and we, as SEOs, practice the optimization too much to the detriment of even the learning engine.
Erin Sparks: 00:59:52 Dawn, it’s been fantastic. You certainly have a wealth of information in this space. We’re eager to listen to you at SMX West. We do need to unfortunately wrap up here, and there’s so much more in this conversation, but we do want to always ask of our guests a couple key questions. What bugs you about your industry right now?
Dawn Anderson: 01:00:17 Oh, what bugs me about my industry?
Erin Sparks: 01:00:22 We’ve talked a lot about some of those old players, but if there’s nothing that comes to mind, don’t worry. Conversely, can you tell us what you’re excited about right now in your industry?
Dawn Anderson: 01:00:34 I’m excited about the progress. I’m excited by the impact that BERT will have. I’m excited about … I mean, BERT, for instance, is just being … One of the key parts of BERT is the transformer element, because it’s Bidirectional Encoder Representations from Transformers [crosstalk 01:00:54]
Erin Sparks: 01:00:54 That still doesn’t roll off the tip of the tongue, but you’ve done it. You’ve got it.
Dawn Anderson: 01:00:57 The other day Google just released something called Reformer, which is an extension or a huge expansion and improvement on Transformer, which means that BERT can actually understand a word in the context of something the size of a novel versus a few paragraphs.
Dawn Anderson: 01:01:14 I think we’re going to see hockey-stick movements, and I’m excited for SEO within the marketing space. That is the one thing that bugs me actually. I sometimes feel that marketers out there don’t have SEO at the table of marketing enough, and that’s partly because we exclude ourselves a little bit. I’m a marketer. I’m a marketer that’s interested in the technical aspects of search. And I think a lot of us are like that, but we get overlooked a lot by marketing people, perhaps because of some of the strategies of some people in the past and even today. It bogs us down a little bit. But, generally, I think that marketers should have more respect for SEO as a profession.
Erin Sparks: 01:02:04 Amen to that. Well, is there something that we can promote for you today on the show?
Dawn Anderson: 01:02:11 Well, yeah. I’m speaking at … Well, I’m at Bertey, and my website is Bertey, Bertey.com. And also I’m speaking at SMX West in San Jose, so looking forward to seeing you there and everybody else. And I’m speaking at a few other things as well. Friends of Search in Amsterdam. Just quite a few conferences this year. But I’m not doing a huge amount of travel, ’cause I’ve got to do the work in between. So yeah. I’ll hope to see a few people at SMX West next probably.
Erin Sparks: 01:02:45 It’s going to be a good showing there. Certainly, I know you want to unpack more and more, and we certainly want to hear more about this and natural language processing. Hopefully, you’ll springboard and even have more information at that presentation. I know you’re working on the deck right now.
Dawn Anderson: 01:03:03 Yeah.
Erin Sparks: 01:03:04 All right. Well, we look forward, and we’d love to have you back and talk more about this. There’s nothing that’s not going to ever increase in the natural-language processing, the voice search-processing realm of search. BERT’s getting more and more teeth. There’s much more information that we need to digest. And we actually do have a role to be able to create accurate and relevant content regularly. So this is the place to encamp, because if you can create some really good content, we’re going to be –
Dawn Anderson: 01:03:35 And keep your content updated.
Erin Sparks: 01:03:37 There it is. Freshen that stuff up on a regular basis. Final thoughts for the digital marketers that are listening and scratching their heads on how they can crack in and benefit from natural-language processing?
Dawn Anderson: 01:03:53 Do a lot of study into your audience. Identify their information needs at every stage. Map it. Think about their timelines. Think about their journeys. Think about their tasks and microtasks. Think about the content that they are interested in at that point, as it could be many, and build it. That’s it.
Erin Sparks: 01:04:13 Just build it. All right.
Erin Sparks: 01:04:14 Well, be sure to check out all the trending news that we covered with Dawn in the bonus podcast episode and the bonus YouTube upload, and those will probably be out tomorrow. You want to make sure you follow Dawn on her Twitter handle, Dawnie Ando, D-A-W-N-I-E A-N-D-O, LinkedIn at MS Dawn Anderson, and Instagram Dawn Ando. Want to make sure that you give Bertey the Pomeranian props, because she knew it was coming a few years prior. Thanks so much to Dawn for being on the show, and we certainly want to lift her up and go check her out at SMX West.
Erin Sparks: 01:04:57 We’re going to be having a few people on next week or this coming month. Kim Scott’s going to be on the show on the third. So make sure that you smash that bell on YouTube to get a reminder that we’re going live, and we certainly want to have interaction with our live audiences as well. So for all of us over at Site Strategics and Edge of the Web, thanks for listening, and do not be a piece of cyber driftwood. We’ll talk to you next week. Bye bye.