Does the Extranet module work with Search Server?

Topics: Standard packages, Troubleshooting
Dec 20, 2010 at 7:00 PM
Edited Dec 20, 2010 at 7:03 PM

I am evaluating Composite for use as a portal for technical support. Two of our requirements are that users have to log in (met by the Extranet module) and that users can search the site for information (met by the MS Search Server module). Does anyone know if the two modules will work together to prevent a user from seeing a search result for a page they do not have access to? In other words, we do not want people who have bought Product A to search and get results for Product B, which they did not buy.

Dec 20, 2010 at 8:00 PM

Since MS Search Server is a regular crawler that looks at your HTML pages and not the underlying data stores, you would configure it so that it cannot log in to your extranet, and it would therefore only return search results for publicly available pages.

If you need it to search pages that require login as well, you need to give it a full-access account to crawl with, and afterwards filter out the pages that the current user doesn't have access to. That involves parsing the result for the actual links to the pages, finding those pages in your C1 installation, and checking for each one whether the user has access to it. If not, the result entry should be removed so it doesn't show up in the final result.
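A minimal sketch of that filtering step, in C#. The `SearchHit` class and the `canAccess` callback are hypothetical names made up for illustration; the callback stands in for the real lookup (resolve the URL to a C1 page, then check the user's Extranet permissions), which is not shown here because it depends on your installation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one hit parsed out of the Search Server result.
public sealed class SearchHit
{
    public string Title { get; set; }
    public Uri Url { get; set; }
}

public static class SearchResultFilter
{
    // canAccess is a placeholder (an assumption, not a C1 API): it should
    // return true when the given user may view the page behind the URL.
    public static List<SearchHit> FilterForUser(
        IEnumerable<SearchHit> hits,
        string userName,
        Func<string, Uri, bool> canAccess)
    {
        if (hits == null) throw new ArgumentNullException("hits");
        if (canAccess == null) throw new ArgumentNullException("canAccess");

        // Drop every hit the user is not allowed to open.
        return hits
            .Where(hit => hit.Url != null && canAccess(userName, hit.Url))
            .ToList();
    }
}
```

Because the filtering happens after the search has run, any total hit count or paging reported by the search engine will likely no longer match what the user sees, so those numbers would need to be recomputed from the filtered list.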

Dec 22, 2010 at 10:50 PM

I work with janvik, and I just wanted to follow up on this discussion.

I have followed the guides here and here to integrate my Search Server with Composite C1 2.1 Beta. The search service works and crawls pages that are publicly accessible (outside the Extranet security).

Since Composite supports the implementation of Search Server, I created a crawl rule to use form credentials to access my Composite site (as this is the way Search Server handles logging into sites that are not based on Windows authentication). While it accepts my credentials and saves them, when the full crawl is run Search Server gives me an access denied error and does not crawl any pages. (Yes, the account I am using does have access to all the pages.)

Does anyone have experience using Search Server to crawl Extranet-protected pages with Composite? burningice implies above that this is possible, but I'm not having any success so far.

Also, considering janvik's original post, it looks possible to create a custom security trimmer that lets you refine the content displayed to users based on their login credentials, but that will have to wait until I can get Search Server to crawl protected pages in the first place.

Coordinator
Dec 22, 2010 at 11:36 PM
wesalcock wrote:

.....

Since Composite supports the implementation of Search Server, I created a crawl rule to use form credentials to access my Composite site (as this is the way Search Server handles logging into sites that are not based on Windows authentication). While it accepts my credentials and saves them, when the full crawl is run Search Server gives me an access denied error and does not crawl any pages. (Yes, the account I am using does have access to all the pages.)

.......

I guess what happens is the following: when you fill in the login form, it creates a cookie that indicates there's a logged-in user. The crawler in Search Server isn't a browser and does not send that cookie with its requests, so the Extranet module does not receive the authentication information.

What you can do is create an HTTP module that checks the request headers and, if the request is coming from a crawler and/or from a specified IP address (e.g. 127.0.0.1), logs in the necessary user automatically.
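A rough sketch of such a module, assuming standard ASP.NET (`IHttpModule`). The class name, the user agent marker, the IP address and the `searchcrawler` account name are all placeholders, and whether the Extranet module honors a principal set this way is an assumption that would need to be verified. The principal is set per request rather than through a cookie, since the crawler does not send cookies back.

```csharp
using System;
using System.Security.Principal;
using System.Web;

// Sketch only: treats requests from a trusted crawler as a fixed user.
public class CrawlerAutoLoginModule : IHttpModule
{
    // Placeholders (assumptions): adjust to your crawler and network.
    private const string CrawlerUserAgentMarker = "MS Search";
    private const string CrawlerIpAddress = "127.0.0.1";
    private const string CrawlerUserName = "searchcrawler"; // hypothetical account

    public void Init(HttpApplication application)
    {
        application.AuthenticateRequest += OnAuthenticateRequest;
    }

    private static void OnAuthenticateRequest(object sender, EventArgs e)
    {
        HttpContext context = ((HttpApplication)sender).Context;
        HttpRequest request = context.Request;

        // Leave requests alone that already carry a valid login.
        if (request.IsAuthenticated) return;

        // The user agent alone is easy to fake, so require the IP as well.
        string userAgent = request.UserAgent ?? string.Empty;
        bool isCrawler =
            userAgent.IndexOf(CrawlerUserAgentMarker, StringComparison.OrdinalIgnoreCase) >= 0
            && request.UserHostAddress == CrawlerIpAddress;
        if (!isCrawler) return;

        // Attach the crawler identity to this request only.
        context.User = new GenericPrincipal(
            new GenericIdentity(CrawlerUserName), new string[0]);
    }

    public void Dispose() { }
}
```

The module would still have to be registered in web.config, and it should run after the forms authentication module so that the `IsAuthenticated` check sees the regular login first.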

Coordinator
Dec 22, 2010 at 11:36 PM

I have no practical experience integrating the two, but it sounds like the login either isn't picked up by the Extranet or the authentication cookie doesn't stick. If you edit the Extranet user account inside the C1 Console, you should see the date and time of last activity, last login, etc. Looking at those dates should indicate whether the Extranet registered a successful login.

In case the issue is that Search Server is unable to log in using the default Extranet login form, you can create an alternative method and use the code below to persist the login (standard ASP.NET Forms Authentication with an encrypted cookie):

System.Web.Security.FormsAuthentication.SetAuthCookie(userNameHere, false);
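For context, here is one way that call might sit inside an alternative login endpoint, sketched as a generic ASP.NET handler. The class name, the form field names and the abstract `IsValidUser` method are hypothetical; the credential check is left abstract because how to validate an Extranet user is not covered in this thread.

```csharp
using System.Web;
using System.Web.Security;

// Sketch of an alternative login endpoint for the crawler to post to.
public abstract class CrawlerLoginHandlerBase : IHttpHandler
{
    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        // Field names are assumptions; match them to the crawl rule.
        string userName = context.Request.Form["username"];
        string password = context.Request.Form["password"];

        if (string.IsNullOrEmpty(userName) || !IsValidUser(userName, password))
        {
            context.Response.StatusCode = 403; // reject bad credentials
            return;
        }

        // Persist the login as a non-persistent forms authentication cookie.
        FormsAuthentication.SetAuthCookie(userName, false);

        // Send the client on to the site root once the cookie is set.
        context.Response.Redirect(VirtualPathUtility.ToAbsolute("~/"), false);
    }

    // Placeholder: implement against your Extranet user store.
    protected abstract bool IsValidUser(string userName, string password);
}
```

This only helps if the crawler keeps and resends the cookie it receives from this endpoint, which is the open question in this thread.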

 

Dec 22, 2010 at 11:45 PM

As a quick note, here are the IIS log entries written when Search Server tries to hit the site:

2010-12-22 22:46:40 10.0.8.212 POST /Renderers/Page.aspx pageId=6770af10-d922-44a2-b07e-44135fe6d421&cultureInfo=en-US&dataScope=public 80 - 10.0.8.216 Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+NT;+MS+Search+6.0+Robot) 302 0 64 109
2010-12-22 22:46:40 10.0.8.212 POST /Renderers/Page.aspx pageId=6770af10-d922-44a2-b07e-44135fe6d421&cultureInfo=en-US&dataScope=public 80 - 10.0.8.216 Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+NT;+MS+Search+6.0+Robot) 302 0 64 62

 

And this is what happens when a real user logs in:

2010-12-22 23:40:42 10.0.8.212 POST /Renderers/Page.aspx pageId=6770af10-d922-44a2-b07e-44135fe6d421&cultureInfo=en-US&dataScope=public 80 - 10.0.0.22 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US;+rv:1.9.2.13)+Gecko/20101203+Firefox/3.6.13 302 0 0 562
2010-12-22 23:40:42 10.0.8.212 GET /Renderers/Page.aspx pageId=6770af10-d922-44a2-b07e-44135fe6d421&cultureInfo=en-US&dataScope=public&login=true 80 wes 10.0.0.22 Mozilla/5.0+(Windows;+U;+Windows+NT+6.1;+en-US;+rv:1.9.2.13)+Gecko/20101203+Firefox/3.6.13 200 0 0 78

The difference is noticeable! What it means is another matter. I'll take a minute to review your suggestions, thanks.


Dec 23, 2010 at 1:16 AM
Edited Dec 23, 2010 at 1:20 AM

More thoughts:

I managed to get Search Server to crawl secured Extranet pages using a cookie specified in the crawl rules.

To get this cookie, I used Firefox, logged into my Composite site, exported the current cookies, pulled out the cookie relating to my Composite site, and saved it as a text file to give to Search Server.

Interestingly enough, even though the secured pages were crawled (I can see them in the crawl logs), they do not turn up in the search results. I'm wondering if Search Server is already checking who the user making the search is when displaying the results?

That, or it's just not actually indexing them properly at all...

I'll continue thinking about all this, and about what you all said.